1 INTRODUCTION
The System XXV


The brain of the System XXV is a Foonly computer and its memory.
Associated with this are three disk drives, which store and
retrieve information on the disks, and a tape drive for reading
information into the system from tape. In addition, the system
is connected to one or more networks which allow it to
communicate with users and other computers. These various parts
which make up a System XXV are run by a monitor (or operating
system) called "AUGUST". AUGUST talks to the networks, makes
sure that all the parts of the system are working properly and
in harmony, and oversees users' interaction with the programs
run by the system. AUGUST also communicates directly with
System XXV users through a program called "EXEC" and carries out
many user commands.


Purpose and Structure of the Manual 


This manual provides the information necessary to bring up a
System XXV that has crashed. It is divided into several major
sections, each of which covers a different situation you might
encounter after a system crash. Their order is the same as the
series of questions you might ask yourself when faced with a
system that is not operating correctly. We hope that reading
through the sections in order will enable you to step through
the process of determining what is wrong with a system, deciding
how to bring it up, and finally actually doing the recovery
procedure you have decided on. Because the manual is arranged
in this working order, its first several sections deal with
error conditions and hung systems. Only after these problems
are dealt with can we turn in section 5 to what to do if the
system has crashed. This section discusses recovery procedures
in general, the various types of recovery available on the
System XXV, and when to use each of them. Section 5 is very
important; do not skip it.


Where applicable and helpful, each large section of this manual
is divided into four parts:


Introduction. A quick look at the current section. This
will tell you such things as what the section is about, how
the information is organized, and where to go if it is not
what you need.


Summary. An outline that briefly presents exactly what you
need to know or do. This is meant to be used as a working
document or for quick reference; no explanation is included.


Discussion. A detailed explanation of the information and
procedures outlined in the Summary.


Errors and Recoveries. A list of common problems and
suggested solutions.


1.3 Conventions of the Manual


This manual has a set of conventions that will make it, we hope,
clear and easy to read.


Program names always appear in capital letters, for example,
CHECKDISK.


Special keys on the terminal are indicated by the abbreviations:
<SP> for space
<CR> for carriage return
<ESC> for altmode or escape
<LF> for line feed


Control characters are indicated with the notation "CTRL" and
surrounded by angle brackets, for example, <CTRL-X>. To type a
control character, hold down the CTRL key while typing the
letter. To type <CTRL-X>, for instance, you would hold down the
control key and at the same time type an X (in uppercase or
lowercase).


The manual will refer to switches on the control panel by
function in capital letters. When you look at the control
panel, you will see that the switches are in rows and that
different rows of switches are labeled by what they control; for
example, there is a row of switches labeled "micro processor".
Inside these rows of switches, individual switches are named by
what they do; for instance, in the row of switches labeled
"micro processor", there is a switch named "stop". This switch,
which is used to stop the microprocessor, is called MICRO
PROCESSOR STOP.


You put control panel switches "on" by pushing them up, and you
put them "off" by pushing them down. Some switches are
momentary, which means that after you put them on (up), they
will return to the off (down) position when you release them.
If you read "Put MICRO PROCESSOR STOP on", this means push up
the switch labeled "stop" in the row of switches labeled "micro
processor".


Commands appear in two ways. When the command is discussed in
the text, the first letter of the command is capitalized and
there are no quotation marks. When, on the other hand, you are
directed to enter a specific command, for example in the Summary
of a section, the command and its argument(s) are lowercase and
enclosed in quotation marks. In this case, type exactly what
you see excluding, of course, the quotation marks. If the
operator's terminal is uppercase only, you may type the commands
in uppercase; however, the reverse is not true. All uppercase
commands must be given that way; do not type them lowercase.
Some programs cannot recognize uppercase letters.
2 ERROR CONDITIONS -- HOW TO IDENTIFY THEM
2.1 Introduction


When you are faced with a system experiencing some problem, the
first thing you need to do is determine what this problem may
be. There are three important aids in this process: error
messages, BUGHLT numbers, and error lights. Always read and
record the error messages, BUGHLT numbers, and error lights
before you attempt to bring the system up.


2.2 Error Messages and BUGHLT Numbers


When the system crashes, it usually provides an error message on
the operator's terminal specifying what caused the crash. This
is followed by a BUGHLT number. Always read and record error
messages and BUGHLT numbers when you have a system that is down.
A list of the various BUGHLT numbers and what they mean comes
with this manual.


The system may be set so that it does not print the BUGHLT
number, but only prints the word "BUGHLT" followed by the
location of the BUGHLT. When this happens, type "e[" to force
the system to print the number. The BUGHLT number will be the
second, or right, half of the number printed.
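

As a minimal sketch of the "right half" rule above: assuming the
number is printed in octal as one 36-bit word made of two 18-bit
halves (the usual PDP10 convention; the manual does not state the
width or radix), the BUGHLT number could be picked out as shown.
The function name and the sample value are hypothetical.

```python
HALF_BITS = 18                     # assumed width of each half of a 36-bit word
HALF_MASK = (1 << HALF_BITS) - 1   # 0o777777, selects the right half


def bughlt_from_printed(printed_octal: str) -> str:
    """Return the right half of a printed octal word, as octal text."""
    word = int(printed_octal, 8)
    return format(word & HALF_MASK, "o")


# Hypothetical printed value, not taken from a real BUGHLT listing.
print(bughlt_from_printed("000123000456"))
```

The result is the number to look up in the list of BUGHLT numbers
that comes with this manual.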


2.3 Error Lights


When the system is functioning normally, certain lights on the
control panel are on, others off. When the system crashes,
these lights change. Lights that indicate the system is
functioning correctly are replaced by error lights, lights
indicating some error has occurred. This section will help you
tell the difference between lights lit during normal operation
and error lights.


Normal Lights


When the system is operating normally, a pattern
consisting of four lights will be cycling among the
address lights on the control panel. Unless the system is
very heavily loaded, these lights should be moving. If
they do not move for a reasonable period of time, the
system is probably hung or down.


Error Lights


The following lights on the control panel are lit steadily
only when an error has occurred and the system has crashed
or will crash.


MEM PAR ERR light indicates a memory parity error.


MI PAR ERR light indicates a microcode parity error.


PROG HALT light indicates that the computer has
encountered a halt instruction in AUGUST, the operating
system. Systems programmers occasionally install halt
instructions in AUGUST to help them trace problems.
3 IS THE SYSTEM HUNG OR HAS IT CRASHED?


Before you can deal with a system that is not operating correctly,
you must determine whether it is hung or has crashed. Learning to
recognize a hung system is a matter of practice. There is no one
sure test that will determine if a system is hung, but hung systems
do have the following common symptoms.


1) Lights on the control panel appear static or are immobile and
pulsing in some kind of regular pattern.


2) There is no response when you type <CTRL-C> or <CTRL-T> on
the operator's terminal.


3) You cannot log in from another terminal.


4) You are receiving irate calls from users who are unable to do
anything.


5) In spite of all this, there is no BUGHLT indicated on the
operator's terminal.


If a system is not functioning and does not have one or more of
these symptoms, then it has crashed. See section 5, What to do if
the System has Crashed.
4 WHAT TO DO IF THE SYSTEM IS HUNG
4.1 Introduction


The procedure documented in this section will force a hung
system to crash. This may seem brutal, but it is necessary.
The hung system is in limbo; only after it crashes can you bring
it back up. When you finish this procedure and the system is
down, use the disk recovery procedure to bring it up.


4.2 Summary


1) Put address switch 31 on (up).
2) Put data switch 2 on.
3) Put CONSOLE DEPOSIT THIS momentarily on.
4) Put data switch 2 off.
5) Put data switch 0 on.
6) Put CONSOLE DEPOSIT THIS momentarily on.


7) Wait until activity (the flickering of the lights, etc.)
stops.


8) Bring the system up with the disk recovery procedure.


4.3 Discussion


When the system is hung, it is trapped in the execution of some
process. The procedure outlined in the above Summary is
designed to bring the system out of this cycle and cause it to
crash. This is desirable because crashing is the normal
response to abnormal conditions. When it crashes, the system
tries to take care of itself -- to save files, to protect the
monitor, to print an error message indicating what the problem
may be, and so forth. Furthermore, only after it has crashed
can the system be brought up.


Because a hung system ignores commands entered on the operator's
terminal, to work with one, you must enter the data and commands
manually from the control panel. Put on address switch 31 by
pushing the switch up. Then, put on data switch 2. Finally,
momentarily put on CONSOLE DEPOSIT THIS. This process turns on
bit 2 at address 20 octal in the computer's memory. When you
turn on this bit, you tell the system that everything that is
stored in the temporary storage area should be read back into
its permanent location. Temporary storage contains all new
information the system has not read out to its real disk
location and also the intermediate results from processes being
performed but not yet completed. Before forcing the system to
crash, you need to make sure that all this information is safely
stored in the right place on the disk.
Now put data switch 2 off, put on data switch 0 and then put on
CONSOLE DEPOSIT THIS. By doing this, you turn on bit 0 at
address 20 octal. This bit is used by the system to record and
check its status. When bit 0 is off, the system knows it is
running successfully; when bit 0 is on, it means the system has
encountered a dangerous situation and should crash. Thus, when
you turn on bit 0 manually from the control panel, you trigger a
system crash. Once all the flickering of the lights stops, the
system is down. Bring it up with the disk recovery procedure.
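

The two deposits described above can be sketched as a small
model. It assumes PDP10-style numbering, in which switch 0 is the
leftmost (most significant) bit of a 36-bit word, and that address
and data switches are numbered the same way; under that assumption
address switch 31 alone selects address 20 octal, which agrees
with the text. Memory is modeled as a dictionary, and all names
are hypothetical.

```python
WORD_BITS = 36  # assumed PDP10-style word size


def switch_mask(switch: int) -> int:
    """Value contributed by one switch, with switch 0 as the leftmost bit."""
    if not 0 <= switch < WORD_BITS:
        raise ValueError("switch number out of range")
    return 1 << (WORD_BITS - 1 - switch)


def deposit(memory: dict, address_switches, data_switches) -> int:
    """Model CONSOLE DEPOSIT THIS: store the data switches at the address."""
    address = sum(switch_mask(s) for s in address_switches)
    memory[address] = sum(switch_mask(s) for s in data_switches)
    return address


memory = {}
addr = deposit(memory, [31], [2])   # steps 1-3: ask for temporary storage to be written back
first = memory[addr]                # word deposited by the first CONSOLE DEPOSIT THIS
deposit(memory, [31], [0])          # steps 4-6: ask the system to crash
print(oct(addr), oct(first), oct(memory[addr]))
```

The model only shows which word and which bits the switches
select; what the monitor does with those bits is as described in
the text.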
5 WHAT TO DO IF THE SYSTEM HAS CRASHED
5.1 Introduction


When a System XXV crashes, before you can bring it up you must
decide which recovery procedure to use. This section will help
you do this. It is divided into three parts. The first part
explains what the System XXV's recovery procedures do; the
second part briefly describes the recovery procedures available;
and the third part will help you decide which procedure you need
to use. In addition, we give advice about what to do if you
cannot bring up the system. Once you know which recovery
procedure to use, for specific instructions, go to the major
section describing it.


5.2 What is a Recovery Procedure?


The System XXV is operated by a very large program called the
"monitor". The monitor is basically what makes a machine into a
computer. It is responsible for checking the system to make
sure it is running correctly, transferring information within
the computer and between it and the outside world, overseeing
all the various programs run by the users, keeping the users'
jobs separate and allocating resources to them, and so forth.


In keeping with its two functions, running the system and
overseeing the users' programs and requests, the monitor is
divided into two parts. The most important part is the
"resident", or "kernel", monitor. The two names of this part of
the monitor reflect its two major characteristics. "Kernel"
monitor indicates that this part of the monitor is the core of
the system. It contains basic instructions and information
necessary for the system to function. For this reason, it must
always remain, or reside, in central memory; thus, the name
"resident" monitor.


The second part of the monitor, the "swappable" monitor,
contains information and procedures related to users' needs
rather than system functions. It is called "swappable"
because, unlike the resident monitor, this part of the monitor
is not always present in central memory. Instead its various
parts are copied or "swapped" into central memory only when they
are needed. The Copy File to File process is an example of the
type of procedure located in swappable monitor. When you enter
a Copy command, the system begins by looking for this process in
the parts of swappable monitor present in core. If it discovers
that the Copy process is no longer in memory, it recalls it from
disk and then executes your command.
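

The look-in-core-first behavior described above can be
illustrated with a toy model. This is only a sketch of the idea,
not how AUGUST is implemented: the class, its fields, and the
stand-in "copy" procedure are all hypothetical, and the disk is
modeled as a dictionary.

```python
class SwappableMonitor:
    """Toy model: procedures live on disk and are copied into core on demand."""

    def __init__(self, disk: dict):
        self.disk = disk    # permanent copies of every swappable procedure
        self.core = {}      # the parts currently present in central memory
        self.swap_ins = 0   # how many times a part had to be recalled from disk

    def run(self, name: str, *args):
        if name not in self.core:            # not in core: recall it from disk
            self.core[name] = self.disk[name]
            self.swap_ins += 1
        return self.core[name](*args)        # then execute the command


# Hypothetical stand-in for the Copy File to File process.
monitor = SwappableMonitor({"copy": lambda src, dst: f"copied {src} to {dst}"})
monitor.run("copy", "a.txt", "b.txt")
monitor.run("copy", "c.txt", "d.txt")
print(monitor.swap_ins)
```

The second Copy command finds the procedure already in core, so
only the first one has to go to disk.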


The System XXV's entire monitor is called "AUGUST". AUGUST
thinks it is running on a PDP10. Since it is not, AUGUST
depends on another part of the system called the
"microprocessor". The microprocessor is what translates the
monitor's instructions into something the System XXV can
understand. It consists of a memory containing information
called "microcode", and a "microcontroller" that uses this
information. When AUGUST, the monitor, gives a PDP10-like
instruction, the microcontroller takes the instruction and uses
the microcode to translate it into the equivalent instruction
for a System XXV.


In most System XXV crashes the problem is an error in the
monitor. The System XXV's various recovery procedures are
designed to replace the old copy of the monitor with a new one
from disk or tape where copies are permanently stored. Some
crashes, however, destroy not only the monitor but also the
microcode. Since the microprocessor cannot function without the
microcode, this means that the microprocessor can no longer
translate the instructions you or the monitor try to give it.
In this case, before copying the monitor, the recovery procedure
must also provide a new copy of the microcode. After the
microprocessor has this copy of the microcode, the system is
given the resident monitor. Once the resident monitor is safely
stored in central memory, the system starts running and then
copies the swappable monitor.


After the new monitor is in memory, in most recovery procedures,
the system checks the file system with a program called
CHECKDISK. If everything is OK, the system reports "August in
operation". This means the system is ready to come up and open
itself for normal use. If CHECKDISK discovers something wrong
with the file system, it will not come up. Instead, it will wait
for you to correct the problem. After correcting the problem,
you will have to halt the system and bring it up again. (As you
will learn in the next section, some recovery procedures allow
you to avoid this system checking.)


It should be emphasized that the recovery procedures are simply
programs like the monitor and everything else that runs on the
system. They will not fix any hardware problems, and, in fact,
cannot work if the system has something physically wrong with
it. If you suspect that the system has a hardware problem, or
you cannot bring it up after trying repeatedly, you may need to
contact your manager and Tymshare maintenance.


5.3 Recovery Procedures Available
Introduction


The System XXV has five recovery procedures. None of them is
particularly difficult, but they do have substantial
differences. They are divided into two groups: those
procedures which return the system to normal use, and those
procedures which should be used only after very serious
system error and which do not return the system to normal
use. The Summary below lists all five procedures and
mentions one or two of their most important features.
Following this is a general discussion of what each procedure
does and of the procedures' relationship to each other. To
decide which procedure to use after a crash, see the next
section. For details on exactly how each procedure works,
see the individual section which discusses it. To learn
about recovery procedures in general, see section 5.2, What
is a Recovery Procedure.


Summary 


Each of the following three procedures returns the system to
normal operation and opens it to users.


Disk Recovery. You instruct the system to look on the
disk for the information it needs to come up.
Discussed in section 6.


Tape Recovery. You provide the information the system
needs to come up from tape; providing a new copy of the
microcode is an optional part of this procedure.
Discussed in section 8.


Automatic Recovery. After crashing, the system
immediately copies the information it needs from disk
and tries to bring itself up. This is done
automatically, without waiting for an operator.
Discussed in section 7.


Both of the next two procedures bring the system up closed to
normal users and allow systems programmers to investigate
what is going on. Use them only as a last resort, after
serious system errors, and under supervision.


Standalone Recovery. You instruct the system to come up
without checking the file system or running the system
jobs. Discussed in section 9.


Disk Rebuild. Before bringing the system up, you wipe out
and then rebuild the entire file system, reading copies
of every file from tape. Discussed in section 10.


Discussion


When the System XXV crashes, in most cases you bring it up by
replacing the old copy of the monitor with a new one. You
can provide this new copy either by copying it from disk, in
which case you are doing a "disk recovery", or by reading it
from tape for a "tape recovery". Both of these procedures
are begun by an operator after the system has crashed. As
part of the tape recovery procedure you may also read in a new
copy of the microcode, the information used by the
microprocessor. Replacing the microcode is usually necessary
only after crashes due to power failure.


In addition to the tape and disk recovery procedures, there
is another procedure that the system itself can start up
after a crash. Since the system begins this procedure
without waiting for anyone to instruct it, this third type of
recovery is called "automatic recovery". Automatic recovery
is very much like disk recovery. Upon crashing, the system
immediately copies the current contents of central memory to
disk, copies in a new monitor from disk, and starts to bring
itself up. Automatic recovery never occurs unless the system
was already set for it before crashing. To learn how to set
a system to recover automatically, see section 12, Recovery
Switches.


Disk, tape, and automatic recovery have substantial
differences -- they are either automatic or not and the new
monitor comes from either tape or disk. However, all three
have the same result: they all end with the system checking
itself and the file system and then being opened for normal
use. The next two recovery procedures do not have this
convenient result.


Both standalone recovery and disk rebuild allow the system to
skip important parts of the normal recovery procedure. For
this reason they are very risky; do not attempt them without
being specifically instructed to do so and without
supervision of a systems programmer or manager. In
standalone recovery, the monitor is read from tape and you
then direct the system to bypass its normal self-checking
procedures and come up CLOSED to users. This means that only
the operator's terminal has access to the system; no other
users may log in. When a system will accept input only from
the operator's terminal, it is said to be "standalone". The
standalone recovery procedure takes its name from this
effect.


Even more serious than standalone recovery is disk rebuild.
Disk rebuild allows you to do just what you might suspect
from its name -- rebuild the file system stored on the disk.
As in standalone recovery, you begin by reading in a tape
containing a new monitor. Then, before the system needs to
use any information from the disk, you begin the disk rebuild
procedure. A disk rebuild involves destroying all the
current versions of every file, and returning to the version
stored on dump tapes; normally, it should NEVER be used.


5.4 Deciding Which Recovery Procedure to Use
Introduction


When a System XXV is down, whether for the first time in a
month or minutes after a previous crash, the first step in
bringing it up is deciding which recovery procedure to use.
This section will help you make this choice. It is divided
into two parts. The Summary contains a table showing types
of crashes and their recovery procedures. The Discussion
explains the logic behind the table; it will tell you why a
particular recovery procedure is used in a certain set of
circumstances. Once you determine which recovery procedure
you need, go to the section discussing it to learn how to use
it.


Summary 


The table below shows when to use each of the System XXV's
five recovery procedures. The left column lists different
situations you might encounter; the right column shows the
recovery procedure you should use. Note that you cannot
decide to use automatic recovery after the crash has
occurred. (Automatic recovery means that the system will try
to bring itself up after a crash without waiting for an
operator.) For automatic recovery to occur, the system must
be set for it before the crash.


Situation                             Recovery Procedure

Crash NOT DUE to power failure        Disk Recovery
Hung system is forced to crash        Disk Recovery
Any "normal" crash                    Disk Recovery
After CHECKDISK problems corrected    Disk Recovery

Recovery begins automatically         Automatic Recovery

Crash DUE to power failure            Tape Recovery
Disk recovery fails                   Tape Recovery
Automatic recovery fails              Tape Recovery

Tape recovery fails repeatedly        Standalone Recovery
                                      FIRST, contact manager
                                      or systems programmer

Entire file system destroyed          Disk Rebuild
                                      FIRST, contact manager
                                      or systems programmer
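

For quick reference, the table above can also be written as a
simple lookup. This is a sketch only; the dictionary keys, the
function name, and the boolean "contact a manager or systems
programmer first" flag are hypothetical conveniences, while the
situations and procedures are those listed in the table.

```python
# situation -> (recovery procedure, must contact manager/systems programmer first)
RECOVERY_TABLE = {
    "crash not due to power failure": ("Disk Recovery", False),
    "hung system is forced to crash": ("Disk Recovery", False),
    'any "normal" crash': ("Disk Recovery", False),
    "after checkdisk problems corrected": ("Disk Recovery", False),
    "recovery begins automatically": ("Automatic Recovery", False),
    "crash due to power failure": ("Tape Recovery", False),
    "disk recovery fails": ("Tape Recovery", False),
    "automatic recovery fails": ("Tape Recovery", False),
    "tape recovery fails repeatedly": ("Standalone Recovery", True),
    "entire file system destroyed": ("Disk Rebuild", True),
}


def choose_recovery(situation: str):
    """Look up a situation (case-insensitive) in the table."""
    return RECOVERY_TABLE[situation.strip().lower()]


procedure, ask_first = choose_recovery("Crash DUE to power failure")
print(procedure, ask_first)
```

An unknown situation raises KeyError rather than guessing, which
matches the advice to call for help when something mysterious is
going on.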


Discussion 


The System XXV has three standard recovery procedures: disk
recovery, tape recovery, and automatic recovery. In addition
to these, there are two more risky recovery procedures,
standalone recovery and disk rebuild, which should not be
used without your manager's approval. This large number of
choices means that you have more flexibility in responding to
a crash, but it also means that you have more choices to
make. Before you can bring up a system that is down, you
must decide which recovery procedure to use. To make this
decision, you must consider: 1) how the system will respond
when it encounters an error while running; 2) the
circumstances of the crash, what caused it, and what effect
it had.


When a System XXV crashes, the first thing it does is check
four internal switches, called "recovery switches". Recovery
switches are four memory locations whose values tell the
system how to respond to a crash. Recovery switches are
explained in section 12; here it is enough to know that they
will tell the system to do one of two things: stop and wait
for an operator, or immediately begin the automatic recovery
procedure and try to come up. You should know how the
recovery switches of each system have been set so you will
know how it will respond to a crash. If you see a system
crash and do not know what it will do, watch the system
until you know whether it is going to come up automatically
or you need to begin a recovery procedure. If you see a
system crash which you know is set for automatic recovery, it
is wise to keep an eye on it and make sure it really does
begin the automatic recovery procedure. Recovery switches
are occasionally destroyed in system crashes. When this
happens, a system originally set to recover automatically
will simply sit there waiting.


If a system does manage to begin an automatic recovery, you
have at first no decisions to make. If all goes well, the
system will come back up and you will not need to do
anything. However, automatic recovery does have two
pitfalls. First, the recovery may not be successful and the
system may hang or crash again. If you notice this
happening, do not let another automatic recovery begin; the
procedure is hardly likely to succeed on a second try.
Instead, halt the system, if necessary, and bring it up
yourself with the backup recovery procedure, tape recovery.
The second problem that can keep the system from coming all
the way up is errors in the file system. As the system comes
up, it uses a program called "CHECKDISK" to examine the disk
and make sure the files are OK. If CHECKDISK discovers
problems, the system will stop to wait for someone to correct
them. After correcting the errors, you will have to halt the
system and bring it up with disk recovery.


If, after a crash, a system does not try to come up
automatically but instead just sits there waiting, then you
must take over and begin some recovery procedure. Your
choices at this point are disk recovery and tape recovery.
Of these, disk recovery is more convenient and should be
tried first, simply because it is so easy to use. However,
disk recovery requires that the procedure itself survive the
crash and that EDDT be available to begin it. Both of these
are part of the old monitor. Crashes due to power failure
destroy the old monitor (and microcode) completely and thus
wipe them out. If you think the crash was the result of
power failure, do not use disk recovery; instead, try tape
recovery.


If you do decide to start with disk recovery and the system
succeeds in running CHECKDISK, this means EDDT and the
recovery procedure itself are OK. Even if CHECKDISK
discovers file problems and the system does not come up, you
may again use disk recovery, after correcting the problems
and halting the system. File problems do not indicate that
anything is wrong with the actual recovery procedure.
However, if the first time you try disk recovery the system
does not get as far as running CHECKDISK -- the recovery
procedure never starts or the system crashes or hangs -- you
will know that either EDDT or the disk recovery procedure or
both did not survive the crash. In these circumstances, it
is a waste of time to try the procedure a second time.
Instead, halt the system, if necessary, and switch to tape
recovery.
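

The reasoning in the paragraph above reduces to two questions:
did the system get as far as running CHECKDISK, and did CHECKDISK
find file problems? A minimal sketch of that decision follows;
the function name and the wording of the returned advice are
hypothetical.

```python
def after_disk_recovery(reached_checkdisk: bool, file_problems: bool) -> str:
    """Suggest the next step after one disk recovery attempt."""
    if not reached_checkdisk:
        # EDDT or the disk recovery procedure did not survive the crash,
        # so a second disk recovery attempt would be a waste of time.
        return "halt the system if necessary, then switch to tape recovery"
    if file_problems:
        # The new monitor is in place; only the file system needs attention.
        return "correct the file problems, halt the system, use disk recovery again"
    return "system is up; no further recovery needed"


print(after_disk_recovery(reached_checkdisk=False, file_problems=False))
```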


Tape recovery, the last of the three "normal" recovery
procedures, is the backup procedure. In tape recovery the
system copies the information it needs from tape; thus,
recovery does not depend on any part of the system being able
to function. Instead you enter all commands to the system
through the control panel. However, although tape recovery
is the most reliable of the recovery procedures, it is also
the most time-consuming and inconvenient. Do not try tape
recovery if you think disk recovery will work.


If you must use tape recovery, because the power failed or
disk and automatic recovery do not work, feel free to try it
several times. If you are using it after a power failure, or
if nothing happens when you try to read the monitor tape,
begin the procedure by reading in the microcode tape. If
recovery does start, but the system never reaches the point
of running CHECKDISK, halt the system, if necessary, and try
the procedure over, again beginning by reading in the
microcode tape. If, after trying the recovery three times
from the beginning, the system still does not run CHECKDISK,
something may be seriously wrong. Notify a systems
programmer or your manager; they may want to try standalone
recovery. Once CHECKDISK does run, even if CHECKDISK
discovers problems with the file system, the new monitor is
in memory and this part of the recovery procedure has been a
success. In addition, since EDDT and the disk recovery
procedure are part of the monitor, they are again available.
If CHECKDISK detects file problems and you must halt the
system after correcting them, you may use disk recovery to
bring the system back up.
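

The retry rule for tape recovery can be sketched the same way.
The limit of three attempts comes from the text; the function
name, its input (one true/false value per attempt, saying whether
that attempt reached CHECKDISK), and the returned wording are
hypothetical.

```python
MAX_TAPE_ATTEMPTS = 3  # the text suggests asking for help after three full tries


def tape_recovery_plan(attempt_reached_checkdisk) -> str:
    """Suggest the next step given the outcome of each attempt so far."""
    for attempt, reached in enumerate(attempt_reached_checkdisk, start=1):
        if reached:
            return f"attempt {attempt}: CHECKDISK ran; monitor is in memory"
        if attempt >= MAX_TAPE_ATTEMPTS:
            return "notify a systems programmer or your manager"
    return "try again, beginning with the microcode tape"


print(tape_recovery_plan([False, False]))
```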


Standalone recovery is the procedure used when some problem
with the file system, CHECKDISK, the system jobs, and so
forth, is causing the system to crash after the monitor is
read into memory, but before it can come up all the way and
return to normal use. In standalone recovery, after the
monitor is copied from tape, the system simply stops where it
is and waits. A systems programmer can then examine the
monitor, the file system, and so forth, and try to determine
what is wrong. A standalone recovery is not hard to do, but
since the system comes up without checking how it is
operating or making sure the file system is good, great harm
may be done by mistake. Never undertake standalone recovery
without expert supervision.


The final type of recovery procedure, disk rebuild, should be
used only after a systems programmer has determined that a
crash has damaged the file system beyond all hope of repair.
Disk rebuild allows you to bring up the system in such a way
that before anything is needed from the disk, the entire file
system is replaced with backup files from tape. Because
files can be replaced only with their most recent backups,
the most current versions of many files will be permanently
lost. The decision to do a disk rebuild can be made only by
a manager or a systems programmer, and we hope the procedure
will never have to be used.


5.5 Difficulty Bringing Up the System


If you cannot bring up the system or feel that something
mysterious is going on, call Tymshare Maintenance or an OAD
operating systems programmer.
6 DISK RECOVERY
6.1 Introduction


You should try to bring the System XXV up with the disk recovery
procedure after any crash that is not due to power failure. The
procedure is quick and easy; however, it relies on part of the
monitor surviving the crash. This means it may not always work.
Do not try to use disk recovery more than ONCE. If your first
attempt at bringing up the system with disk recovery fails
before CHECKDISK is run, you must halt the system, if necessary,
and then switch to tape recovery. If CHECKDISK does run, then
the system's new monitor is in place and this part of recovery
has been successful. Even if CHECKDISK finds problems with the
file system and you must halt the system after taking care of
them, you may again use disk recovery to bring the system back
up. To correct problems found by CHECKDISK and to halt the
system, see section 13, Related Procedures.
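

The rules in this introduction can be summarized as a small
decision function. This is only an illustrative sketch in
Python; the function name and its three arguments are
hypothetical and are not part of the System XXV or AUGUST.

```python
def next_recovery(power_failure: bool, disk_attempted: bool,
                  checkdisk_ran: bool) -> str:
    """Return "disk" or "tape", following the rules stated above."""
    if power_failure:
        # Disk recovery is only for crashes not due to power failure.
        return "tape"
    if not disk_attempted:
        # First attempt after an ordinary crash: try disk recovery.
        return "disk"
    # Disk recovery was already tried once. It may be used again only
    # if that attempt got far enough for CHECKDISK to run.
    return "disk" if checkdisk_ran else "tape"


print(next_recovery(False, True, False))
```

The example call describes a first disk recovery attempt that
failed before CHECKDISK ran, so the sketch answers "tape".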


6.2 Summary


1) Check the BUGHLT number on the operator's terminal and look
it up in the list of BUGHLTs. In addition, note which error
lights are lit. Record all this information.


2) On the operator's terminal, type "dskrld<ESC>g".


3) The response should be "reloading from disk". If you never
get this message, begin a tape recovery.


4) When the system says, "BOOT FROM DISK PACK # [CR FOR ANY]",
type <CR>. The system will begin to copy the monitor and
will record its progress in messages.


5) When the operator's terminal says "EDDT", type "start<ESC>g".


6) After a short time, the system should report the size of the
memory and print several messages about BAT blocks.


7) CHECKDISK will run and check the file system. If it finds no
major errors, it will report the number of disk pages used
and the number available. If bad files are discovered, they
will be listed and the system will announce "August not in
operation".


8) If CHECKDISK runs successfully, the system will announce
"August in operation" and ask for the date and time. Enter
these in the form DD-MON-YY<SP>HH:MM; follow with <CR>. The
system jobs will log in automatically.


9) When the system prompts you with "@", the prompt for EXEC,
log in by typing "oper<SP>password<SP><CR>", where password
stands for your password.


10) After the system prints various messages and prompts you with
another "@", type "ena<CR>". The response will be a new
prompt, "!".


11) Type "ref<SP>a<CR>".


6.3 Discussion


When the System XXV crashes, the first thing it does is check
its recovery switches. (For information on recovery switches,
see section 12, Recovery Switches.) If the recovery switches do
not tell the system to come up automatically, the system simply
stops and waits for someone to tell it what to do next. You now
need to step in and bring the system up by providing a new copy
of the monitor. In disk recovery, you do this by telling the
system to get a new copy of the monitor from disk, where it is
permanently stored. Disk recovery thus saves you the
inconvenience of finding and loading the monitor tape and
switching all the switches on the control panel. However, it
will not always work. To start the copying procedure, you must
use part of the old monitor called "EDDT". EDDT escapes most
crashes without harm; however, crashes due to power failure
always destroy EDDT, and sometimes other crashes, for example
those due to power surges, will also damage it. If you suspect
that the crash was due to power failure, do not try to bring up
the system with disk recovery; use tape recovery instead.


You begin disk recovery by typing "dskrld<ESC>g". "Dskrld"
stands for "disk reload"; it is the name of a location in the
system's memory. This location is the beginning of the disk
recovery procedure, a program that copies a new monitor from a
file stored on the disk. When you type "dskrld<ESC>g", you tell
EDDT to go to this procedure and begin running the program found
there. When the procedure begins, it prints "reloading from
disk". If this message never appears, it means EDDT was wiped
out by the crash and you cannot reach the disk recovery
procedure. In this case, begin a tape recovery.


If control is successfully transferred to the disk recovery
procedure, the procedure first moves itself to a special spot in
memory, beginning at location 3000, and makes room for the new
monitor by clearing the rest of central memory. The system next
needs to know where it should look for a new monitor. Each disk
has a copy of the monitor stored in a file named
<SYSTEM>MONITOR.PACK-x;1, where x stands for the disk number.
The monitor file on disk pack 0, for example, is named
<SYSTEM>MONITOR.PACK-0;1. To find out which disk it should check
for the new monitor, the system will ask "BOOT FROM DISK PACK #
[CR FOR ANY]". The standard answer here is <CR>. This tells the
system to start by looking on disk pack 0 for the file; if it is
not there, look on disk pack 1, and finally check disk pack 2.
To tell the system to check only a particular disk pack, instead
of answering the question with <CR>, give the number of the
pack. If the system cannot find a good monitor file, it will
print out "FAILED TO READ RESIDENT MONITOR". Since disk
recovery cannot work without reading the resident monitor from
disk, you will have to halt the system with Method B in section
13.5 and then use tape recovery to bring the system up.
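

The search order described above can be pictured with a short
Python sketch. It is an illustration only: the names
find_monitor_pack and has_good_monitor are hypothetical, and the
real check of the monitor file is replaced by a stand-in function
supplied by the caller.

```python
from typing import Callable, Optional

PACKS = (0, 1, 2)  # the three disk packs named in the text


def find_monitor_pack(answer: str,
                      has_good_monitor: Callable[[int], bool]) -> Optional[int]:
    """Return the pack holding a usable monitor file, or None.

    answer is the operator's reply to the boot prompt: "" for a
    bare <CR> (try packs 0, 1, 2 in that order) or a pack number.
    has_good_monitor is a hypothetical stand-in for reading the file.
    """
    candidates = PACKS if answer == "" else (int(answer),)
    for pack in candidates:
        if has_good_monitor(pack):
            return pack
    # Here the system would print "FAILED TO READ RESIDENT MONITOR".
    return None


# Example: only pack 2 has a good copy of the monitor file.
print(find_monitor_pack("", lambda pack: pack == 2))
print(find_monitor_pack("1", lambda pack: pack == 2))
```

With a bare <CR> the sketch reaches pack 2 after trying packs 0
and 1; when told to look only on pack 1, it finds nothing, which
is the case where you must halt the system and use tape recovery.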


If the system does find a usable copy of the monitor file, the
disk reload procedure copies a new resident monitor from disk
into the cleared memory. The system will inform you of its
progress with various messages. Since you still don't really
know if this procedure escaped the crash without harm, it is
wise to keep an eye on these messages. If anything goes wrong
before the system runs CHECKDISK and reports on the status of
the file system, it means the procedure is unusable. If the
system hangs as it comes up, halt it with Method B documented in
section 13.5 of "Related Procedures" and bring it up with tape
recovery. If the system crashes again, begin a tape recovery.


After the resident monitor is in memory, the system will go into
EDDT and print "EDDT" on the operator's terminal. When you type
"start<ESC>g", you transfer control to the Start procedure.
This procedure starts up the rest of the recovery procedure and
copies the swappable monitor from the second part of the monitor
file.


As the system copies the new monitor, the old settings of the
recovery switches are replaced by the default switch settings
that are part of the new monitor. These default settings are:
DBUGSW = 1, DCHKSW = 0, RELDSW = 1, and CDMPSW = 1. This tells
the system that after crashing it should stop and wait for
instructions on what to do next. Once the system is up, you may
change these default switch settings with the procedure
documented in section 12, Recovery Switches. That section also
explains recovery switches in general.


Once the system's new monitor is in place, the remainder of disk
recovery is exactly the same as tape recovery. Thus, the
following explanation is identical to the last part of the
Discussion in the section on tape recovery. This explanation is
included here for your convenience; if you are already familiar
with tape recovery, you do not need to read further.


Now that it has its new monitor, the system turns its attention
to the memory and file system. It first reports on the size of
the memory and tells you about the BAT blocks. "BAT" stands for
"Bad Address Table". BAT blocks contain tables that are used to
keep track of what parts of the disk are bad and thus should not
be used. Once it is determined what parts of the disk are bad
and should not be used for storage, the system runs a program
called "CHECKDISK". CHECKDISK, as the name indicates, checks
the disks and the integrity of the file system. It makes sure
that no section of the disk is allocated to more than one file
and that all file addresses are valid. If CHECKDISK discovers
errors in the file system, it lists the bad files, and the
system stops and waits for you to correct them. In this case,
the system will not be able to come up; to let you know what is
happening it will announce, "August not in operation". For more
information on CHECKDISK and instructions for correcting the
errors it detects, see section 13.2, Correcting Problems Found
by CHECKDISK.
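

To make the two checks concrete, here is a minimal Python sketch
of the kind of test described above. It is not the CHECKDISK
program: the function name, the dictionary of files, and the page
numbers are all hypothetical, and a valid address is assumed to
mean simply a page number that exists on the disk.

```python
from typing import Dict, List, Set


def find_bad_files(files: Dict[str, List[int]], disk_pages: int) -> List[str]:
    """Return the names of files that fail either check, sorted.

    files maps a file name to the disk page addresses it claims.
    A file is bad if one of its addresses is outside the disk, or
    if one of its pages is claimed more than once.
    """
    owners: Dict[int, List[str]] = {}
    bad: Set[str] = set()
    for name, pages in files.items():
        for page in pages:
            if not 0 <= page < disk_pages:
                bad.add(name)  # invalid file address
            owners.setdefault(page, []).append(name)
    for names in owners.values():
        if len(names) > 1:
            bad.update(names)  # page allocated more than once
    return sorted(bad)


example = {"A": [1, 2], "B": [2, 3], "C": [4], "D": [99]}
print(find_bad_files(example, disk_pages=10))
```

In the example, files A and B both claim page 2 and file D points
past the end of a ten-page disk, so all three are listed; file C
is clean.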


If CHECKDISK finds no serious file problems, it reports on disk
use, and then, once it is finished, the system will announce,
"August in operation". At this point, the system is completely
ready to come up and open itself to users. It needs only two
more things from you: the date and time. When the system
directs you to enter the current date and time, type two numbers
for the day, a dash, the first three letters of the month, a
dash, and then two numbers for the year. Follow these with a
space and then give the time, on a 24 hour basis, as two numbers
for the hour, a colon, and then two numbers for the minutes; be
sure to give the correct time. Follow all this with a carriage
return. For example, you would enter the date March 9, 1981,
and the time 5:04 pm as "09-mar-81<SP>17:04<CR>". If you enter
the wrong date and time, finish the recovery procedure and then
correct your mistake as documented in section 13.6, Changing the
Date and Time.
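

As a check on the format, the following Python sketch builds the
text you would type for a given date and time. The function name
is hypothetical and the sketch is only an illustration of the
DD-MON-YY HH:MM layout described above; the space stands for <SP>
and the final <CR> is left out.

```python
from datetime import datetime

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def august_date_entry(when: datetime) -> str:
    """Format a date and time as DD-MON-YY HH:MM (24 hour clock)."""
    month = MONTHS[when.month - 1]
    return (f"{when.day:02d}-{month}-{when.year % 100:02d} "
            f"{when.hour:02d}:{when.minute:02d}")


# March 9, 1981 at 5:04 pm, the example used in the text.
print(august_date_entry(datetime(1981, 3, 9, 17, 4)))
```

For the example in the text this prints 09-mar-81 17:04.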


After you have entered the date and time, the system is
officially up. The system jobs will now log in automatically
and you will be prompted with "@", the prompt for EXEC. This is
an invitation to log in. Log in as an operator by typing
"oper<SP>password<SP><CR>", that is: "oper" (for operator), a
space, your password, a space, and then a carriage return. In
the interests of secrecy, your password will not print. After
you have logged in, the system will print various messages and
another "@". Type "ena<CR>". This stands for "enable" and
tells the system to allow you to perform operations denied the
normal user. Once you have "enabled", or identified yourself to
the system as a person with special powers, the system will
change its prompt to "!". Now refuse automatic logout by typing
"ref<SP>a<CR>". AUGUST normally logs out users who leave their
terminals idle.


6.4 Errors and Recoveries
Nothing happens when you type "dskrld<ESC>g"


If nothing happens when you type "dskrld<ESC>g", this means
the disk recovery procedure cannot be used. Instead, use the
tape recovery procedure.


The system cannot find a monitor file 


If the system cannot find the monitor file, it will tell you
"FAILED TO READ RESIDENT MONITOR". If you have told the
system to look on a specific disk for the monitor, halt the
system (with Method B of section 13.5), try another disk
recovery, and tell the system to look on a different disk for
the monitor. If, after checking them all (either by typing a
<CR> or individually giving the number of each disk), you
discover that none of the disks have a good copy of the
monitor file, you cannot use disk recovery. Halt the system,
if necessary, and bring it up with a tape recovery.


Errors before CHECKDISK reports on the file system 


If the system hangs or crashes before CHECKDISK reports on
the status of the file system and you never get the message
"August in operation" or "August not in operation", recovery
will not be successful. Bring up the system with the tape
recovery procedure.


CHECKDISK discovers problems with the file system 


If CHECKDISK finds anything wrong with the file system, the
System XXV will stop and wait for you to correct the
problems. It cannot come up while something is wrong with
the file system; the risk of destroying files is too great.
For directions on how to correct any problems CHECKDISK
finds, see section 13.2, Correcting Problems Found by
CHECKDISK.


7 AUTOMATIC RECOVERY
7.1 Introduction


How a System XXV responds to a crash is determined by its
recovery switches. Recovery switches are explained in section
12, Recovery Switches. If they are set for automatic recovery
BEFORE a crash, then, after one occurs, the system should
immediately try to bring itself up with the procedure documented
here. This is convenient, since you do not have to start a
recovery procedure every time the system crashes. However, do
not assume that systems set for automatic recovery will never
need your help. If the crash was due to power failure or it
damaged memory, automatic recovery will never begin; the system
will simply sit there and you will have to begin a tape
recovery. If the recovery procedure was somehow damaged in the
crash, the system may start to bring itself up and then hang or
crash again. In this case too, you must step in, halt the
system, if necessary, and use tape recovery. Even if automatic
recovery begins and gets as far as running CHECKDISK, success is
not guaranteed. If CHECKDISK detects problems in the file
system, automatic recovery can proceed no further. After
correcting the file problems and halting the system (both
documented in section 13, Related Procedures), you will have to
use disk recovery to bring the system up.


7.2 Summary


If a System XXV set for automatic recovery comes up
successfully, you do not need to do anything until you log in as
an operator. If recovery never starts, if it fails before
CHECKDISK reports on the file system, or if CHECKDISK discovers
bad files, see "Errors and Recoveries" in this section for
instructions.


1) The system will begin to bring itself up; various messages
will record its progress.


2) CHECKDISK will run and check the file system. If it finds no
major errors, it will report the number of disk pages used
and the number available. If bad files are discovered, they
will be listed and the system will announce "August not in
operation".


3) If CHECKDISK runs successfully, the system will announce
"August in operation". The system is now up. The system
jobs will log in automatically.


4) When the system prompts you with "@", the prompt for EXEC,
log in by typing "oper<SP>password<SP><CR>", where password
stands for your password.


5) After the system prints various messages and prompts you with
another "@", type "ena<CR>". The response will be a new
prompt, "!".


6) Type "ref<SP>a<CR>".


7.3 Discussion


When a System XXV crashes, the first thing it does is check its
recovery switches, the switches which tell it what it should do
next. Recovery switches and their various settings are
explained in section 12, Recovery Switches. One setting of the
recovery switches will tell the system to come up automatically.
If, after a crash, the system discovers that the switches are
set in this way, it will immediately start to bring itself back
up without waiting for an operator.


There are, however, several problems that can stop systems from
coming up automatically. First of all, the system may never
find out that it was supposed to do this. In crashes due to
power failure and those that damage memory, the recovery switch
settings may be lost or never checked. Consequently, a system
you think is set to come up automatically will not. Instead, it
will wait for you to begin a recovery procedure, just as it
normally does after a crash. Keep an eye on all systems set for
automatic recovery: if you see a system that appears to be down
and not trying to come up, you will have to use the tape
recovery procedure to bring it up. Do not try to use the disk
recovery procedure; it too will be lost along with the recovery
switches.


If the system does remember its recovery switch settings and try
to come up automatically, it usually begins by copying the
current contents of central memory into two files. The first
512 pages of memory are stored in a file called
<SYSTEM>CORDMP.LOW and the second 512 pages are stored in a file
called <SYSTEM>CORDMP.HGH. These files are used by systems
programmers to find out what was in central memory right after
the crash. If you do not want the system to bother with this
copying, you may set the recovery switches so that it will not
be done. See section 12, Recovery Switches.


Once the contents of core have been safely stored in the CORDMP
files, the system next needs a new copy of the monitor. The
system copies the monitor from disk with a procedure very much
like the disk recovery procedure. The procedure prints messages
to help you follow its progress. It is a very good idea to read
these messages and make sure recovery is progressing
successfully. Even after the recovery procedure starts, things
can still go wrong. If the procedure was damaged by the crash,
the system may hang as it comes up or may try to come up, fail,
and crash again. After crashing, the system would once more
check the recovery switches, discover it should come up
automatically, take another core dump, and try to come up. As
it tried to come up, the system would encounter the same problem
and crash again. The system could thus get caught in a loop of
crashing, trying to come up, and crashing again. If you notice
a system set for automatic recovery that appears to be hung or
having some kind of trouble, watch it for a while. If it never
gets to the point of running CHECKDISK, halt the system with
Method B of section 13.5, Halting the System. Then switch to
tape recovery.


If all goes well with the automatic recovery procedure, it will
announce "reloading from disk", move itself to a special place
in memory, starting at location 3000, and clear the rest of core.
A new resident monitor is then copied from a file where it is
permanently stored on the disk. After the resident monitor is
read, the system starts up and transfers control to the Start
procedure. This is very much like the procedure you get when you
give the Start command in the disk recovery procedure. The
Start procedure starts up the rest of the recovery procedure and
copies the swappable monitor from the second part of the MONITOR
file.


Now that it has its new monitor, the system turns its attention
to the memory and file system. It first reports on the size of
the memory and tells you about the BAT blocks. "BAT" stands for
"Bad Address Table". BAT blocks contain tables that are used to
keep track of what parts of the disk are bad and thus should not
be used. Once it is determined what parts of the disk are bad
and should not be used for storage, the system runs a program
called "CHECKDISK". CHECKDISK, as the name indicates, checks
the disks and the integrity of the file system. It makes sure
that no section of the disk is allocated to more than one file
and that all file addresses are valid. If CHECKDISK discovers
errors in the file system, it lists the bad files, and the
system stops and waits for you to correct them. Thus, if
CHECKDISK discovers problems, the system cannot come all the way
up automatically. Instead, the system will announce, "August
not in operation"; you must take over and fix the file problems
CHECKDISK has found. For instructions on how to do so, see
section 13.2, Correcting Problems Found by CHECKDISK.


If CHECKDISK finds nothing wrong with the file system, it
reports on disk use, and then, once it is finished, the system
will announce, "August in operation". At this point, the system
is up. Notice that you are not required to enter the date and
time as you must do to end the disk and tape recovery
procedures. During automatic recovery, unlike the other
recovery procedures, the system's internal clock continues to
run. To learn the correct time, the system simply uses it
instead of asking you. In addition to using the system's clock
to find the time, the end of the automatic recovery procedure
differs in another way from disk and tape recovery. During
automatic recovery, the system saves the original recovery
switch settings. When recovery is over, these settings are
restored and replace the default switch settings that are read
in as part of the new monitor. This means that after an
automatic recovery the recovery switches continue to be set for
automatic: next time the system crashes it will again try to
bring itself up automatically.


Once the system has found out the time and come all the way up,
the system jobs can log in automatically and you will be
prompted with "@", the herald for EXEC. This is an invitation
to log in. Log in as an operator by typing
"oper<SP>password<SP><CR>", that is: "oper" (for operator), a
space, your password, a space, and then a carriage return. In
the interests of secrecy, your password will not print. After
you have logged in, the system will print various messages and
another "@". Type "ena<CR>". This stands for "enable" and
tells the system to allow you to perform operations denied the
normal user. Once you have "enabled", or identified yourself to
the system as a person with special powers, the system will
change its prompt to "!". Now refuse automatic logout by typing
"ref<SP>a<CR>". AUGUST normally logs out users who leave their
terminals idle.


7.4 Errors and Recoveries
The system never begins to bring itself up


If a system that is supposed to be set for automatic recovery
never announces "reloading from disk" to show it has begun,
bring the system up with tape recovery.


Errors before CHECKDISK reports on the file system


If the system hangs or crashes before CHECKDISK reports on
the status of the file system and you never get the message
"August in operation" or "August not in operation", recovery
will not be successful. Bring up the system with the tape
recovery procedure.


CHECKDISK discovers problems with the file system 


If CHECKDISK finds anything wrong with the file system, the
System XXV will not come up automatically; the risk of destroying
files is too great. Instead, it will stop and wait for you
to correct the problems. For directions on how to correct
any problems CHECKDISK finds, see section 13.2, Correcting
Problems Found by CHECKDISK.


8 TAPE RECOVERY
8.1 Introduction


Tape recovery is basically a backup recovery procedure. It
should be used after automatic or disk recovery has failed or
after a crash due to power failure. Tape recovery is normally a
straightforward and not particularly difficult procedure, but it
can grow rather complicated, particularly if the crash has
somehow damaged the file system or wreaked other havoc. If you
encounter problems while bringing the system up, check section
8.4, Errors and Recoveries; you may find the solution to your
problem there. If your problem is not covered in Errors and
Recoveries, try the whole procedure over, starting from step 3.
If this does not work, something may be seriously wrong. Notify
your manager; he or she may want to try Standalone Recovery or,
as a last resort, Disk Rebuild.


8.2 Summary

1) Check the BUGHLT number on the operator's terminal and look
it up in the list of BUGHLTs. In addition, note which error
lights are lit. Record all this information.

2) If the power has gone off, you must reload the microcode as
documented in steps 3 through 12. If the power has not gone
off, skip to step 13.

3) Mount the microcode tape on the tape drive.

4) Put all switches on the control panel off (down).

5) Put address switch 32 on (up).

6) Put MICRO PROCESSOR STOP on.

7) Put MICRO PROCESSOR MIPC on.

8) Put MICRO PROCESSOR CLR momentarily on.

9) Put MICRO PROCESSOR CONT momentarily on.

10) Put MICRO PROCESSOR MIPC off.

11) Put MICRO PROCESSOR STOP off.

12) Put MICRO PROCESSOR CONT momentarily on. The tape should
spin and then stop. Remove the microcode tape from the tape
drive.

13) You are now ready to read in the new monitor. Mount the
monitor tape on the tape drive. [Start here if you do not
want to load the microcode.]

14) Put all switches on the control panel off. 

15) Put address switches 24 and 26 on.

16) Put MICRO PROCESSOR STOP on.

17) Put MICRO PROCESSOR MIPC on.

18) Put MICRO PROCESSOR CLR momentarily on.

19) Put MICRO PROCESSOR CONT momentarily on.

20) Put MICRO PROCESSOR MIPC off.

21) Put MICRO PROCESSOR STOP off.

22) Momentarily put MICRO PROCESSOR CONT on. The tape should
spin and then stop.

23) Put address switches 24 and 26 off.

24) Put address switches 29 and 30 on.

25) Momentarily put CONSOLE START on twice.

26) When the operator's terminal says "EDDT", type
"start<ESC>g". The monitor tape should spin.

27) Remove the monitor tape from the tape drive.

28) After the system reports the size of the memory, put MI PAR
ERR STOP and MEM PAR ERR STOP on. The system will print
several messages about BAT blocks.


29) CHECKDISK will run and check the file system. If it finds
no major errors, it will report the number of disk pages used
and the number available. If bad files are discovered, they
will be listed and the system will announce "August not in
operation".

30) If CHECKDISK runs successfully, the system will announce
"August in operation" and ask for the date and time. Enter
these in the form DD-MON-YY<SP>HH:MM; follow with <CR>. The
system jobs will log in automatically.

31) When the system prompts you with "@", the prompt for EXEC,
log in by typing "oper<SP>password<SP><CR>", where password
stands for your password.

32) After the system prints various messages and prompts you with
another "@", type "ena<CR>". The response will be a new
prompt, "!".

33) Type "ref<SP>a<CR>".


8.3 Discussion


Although you can bring a System XXV up after most crashes simply
by providing it with a new monitor, this is not always true.
Some crashes, especially those due to power failures, wipe out
not only the monitor, but also the microcode. Since the
microcode is the information the microprocessor uses to
translate the monitor instructions into something it can
understand, when this happens you must begin recovery by reading
in a tape containing the microcode. Only after the microcode is
available can the system correctly read the monitor tape.


Since a system that does not have access to microcode or any
part of the monitor cannot understand any instructions given on
the operator's terminal, to reload the microcode you must give
the system instructions from the control panel. Mount the
microcode tape on the tape drive and put address switch 32 on by
pushing the switch up. Also, find the row of switches labeled
"MICRO" and put the MICRO PROCESSOR STOP and MICRO PROCESSOR
MIPC on. You then briefly put on MICRO PROCESSOR CLR followed
by MICRO PROCESSOR CONT. This process tells the
microprocessor that the address specified through the address
switches is where it should look for instructions on what to do
next. The address you specify by putting on address switch 32
is address 10 octal. This is the beginning of a tape-reading
routine which is permanently stored in the memory of the
microprocessor. You now want to tell the microprocessor to
execute this routine and read the tape containing the microcode.
You do this by putting off MICRO PROCESSOR MIPC and MICRO
PROCESSOR STOP and putting on MICRO PROCESSOR CONT.


Once the system has read the tape containing the microcode, it
has all the information necessary to read the first part of the
monitor tape, which contains the resident monitor. The
procedure for reading this tape is identical to that for reading
the microcode tape, EXCEPT that you specify a different address
with the address switches. First put off all the switches on
the control panel, then put on address switches 24 and 26, put
on MICRO PROCESSOR STOP and MICRO PROCESSOR MIPC, and finally,
again momentarily put on MICRO PROCESSOR CLR and MICRO PROCESSOR
CONT. This tells the system it should begin executing the
instructions at address 5000 octal, the address specified with
address switches 24 and 26. Address 5000 is the beginning of
instructions for reading the monitor. To execute these
instructions, put MICRO PROCESSOR MIPC and MICRO PROCESSOR STOP
off and again momentarily put on MICRO PROCESSOR CONT. The
microprocessor will read into memory the first part of the
monitor tape; this contains the resident monitor.


Once the resident monitor is in memory, you want to start it
running. When you put on address switches 29 and 30 and then
hit CONSOLE START twice, you tell the system to go to location
140 and start running the procedure it finds there. The
procedure beginning at this location brings the system alive and
starts up the resident monitor. Because EDDT is part of the
resident monitor, the system can now go into EDDT and will print
"EDDT" on the operator's terminal to notify you. Since EDDT can
understand typed commands, you can start up the rest of the
recovery procedure by typing "start<ESC>g" on the operator's
terminal. The system will begin running and read in the
swappable monitor, the second part of the monitor tape.
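

The three switch settings used in this procedure, and the octal
addresses they select, can be checked with a short Python sketch.
It assumes that the address switches are numbered 0 through 35
with switch 35 as the low-order bit; the manual does not say this
outright, but the assumption reproduces all three addresses given
above. The function name is hypothetical.

```python
WORD_BITS = 36  # assumption: switches numbered 0-35, 35 is the low bit


def switch_address(switches_on):
    """Return the address selected by the given address switches."""
    address = 0
    for n in switches_on:
        if not 0 <= n < WORD_BITS:
            raise ValueError(f"no such address switch: {n}")
        address |= 1 << (WORD_BITS - 1 - n)
    return address


for switches in ([32], [24, 26], [29, 30]):
    print(switches, oct(switch_address(switches)))
```

Switch 32 gives 10 octal (the tape-reading routine), switches 24
and 26 give 5000 octal (the monitor-reading instructions), and
switches 29 and 30 give 140 octal (the start location). Python
prints octal values with a leading "0o".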


When the system copies the new monitor, the old settings of the
recovery switches are changed to the default switch settings
that are part of the new monitor. These default settings are:
DBUGSW = 1, DCHKSW = 0, RELDSW = 1, and CDMPSW = 1. This tells
the system that after crashing it should stop and wait for
instructions on what to do next. Once the system is up, you may
change these default switch settings with the procedure
documented in section 12, Recovery Switches. That section also
explains recovery switches in general.


Now that it has its new monitor, the system turns its attention
to the memory and file system. It first checks how much memory
is physically available and reports on the size of the memory.
Putting on MI PAR ERR STOP and MEM PAR ERR STOP tells the system
to stop if a parity error is encountered in central memory or in
the microcode. The system will next tell you about the BAT
blocks. "BAT" stands for "Bad Address Table". BAT blocks
contain tables that are used to keep track of what parts of the
disk are bad and thus should not be used. Once it is determined
what parts of the disk are bad and should not be used for
storage, the system runs a program called "CHECKDISK".
CHECKDISK, as the name indicates, checks the disks and the
integrity of the file system. It makes sure that no section of
the disk is allocated to more than one file and that all file
addresses are valid. If CHECKDISK discovers errors in the file
system, it lists the bad files, and the system stops and waits
for you to correct them. In this case, the system will not be
able to come up. Instead, after CHECKDISK runs, the system will
announce, "August not in operation". For more information on
CHECKDISK and instructions for correcting the errors it detects,
see section 13.2, Correcting Problems Found by CHECKDISK.


If CHECKDISK finds no serious file problems, it reports on disk
use, and then, once it is finished, the system will announce,
"August in operation". At this point, the system is completely
ready to come up and open itself to users. It needs only two
more things from you: the date and time. When the system
directs you to enter the current date and time, type two numbers
for the day, a dash, the first three letters of the month, a
dash, and then two numbers for the year. Follow these with a
space and then give the time, on a 24 hour basis, as two numbers
for the hour, a colon, and then two numbers for the minutes; be
sure to give the correct time. Follow all this with a carriage
return. For example, you would enter the date March 9, 1981,
and the time 5:04 pm as "09-mar-81<SP>17:04<CR>". If you enter
the wrong date and time, finish the recovery procedure and then
correct your mistake as documented in section 13.6, Changing the
Date and Time.


After you have entered the date and time, the system is
officially up. The system jobs will now log in automatically
and you will be prompted with "@", the prompt for EXEC. This is
an invitation to log in. Log in as an operator by typing
"oper<SP>password<SP><CR>", that is: "oper" (for operator), a
space, your password, a space, and then a carriage return. In
the interests of secrecy, your password will not print. After
you have logged in, the system will print various messages and
another "@". Type "ena<CR>". This stands for "enable" and
tells the system to allow you to perform operations denied the
normal user. Once you have "enabled", or identified yourself to
the system as a person with special powers, the system will
change its prompt to "!". Now refuse automatic logout by typing
"ref<SP>a<CR>". AUGUST normally logs out users who leave their
terminals idle.


8.4 Errors and Recoveries
Using an old monitor tape


You may sometimes have to perform a tape recovery with an old
monitor tape, for example, when you do not have a copy of the
current monitor or the current tape is bad. When this
happens, you can use the resident monitor from an old tape to
start the system running. Once the system is up, you can
switch to disk recovery to replace the old resident monitor
with a good copy of the monitor taken from disk. The
procedure is as follows:


1) Follow the tape recovery procedure from step 13 through
25. If there has been a power problem, do step 3
through 25.


2) When the system types "EDDT" on the operator's
terminal, type "dskrld<ESC>g" to start a disk recovery.


3) Follow the disk recovery procedure beginning from step
3.


Problems reading the microcode tape


If you cannot read the microcode tape, there is a hardware
problem. Call Tymshare Maintenance.


Problems reading the monitor tape


If you cannot read the monitor tape, the microcode may have
been destroyed. Start the recovery procedure over and begin
at step 3 by loading the microcode. If you do not succeed,
call Tymshare Maintenance.


The interrupt message 


If you get a message, "Interrupt at nnn", where nnn is some
number, try reading both tapes again. If you are
unsuccessful, call Tymshare Maintenance.


CHECKDISK is never run 


If the system hangs or crashes before CHECKDISK reports on
the status of the file system and you never get the message
"August in operation" or "August not in operation", recovery
will not be successful. Halt the system, if necessary, and
try the recovery procedure over from step 3. If the complete
procedure does not work on the third try, call Tymshare
Maintenance.


CHECKDISK discovers problems with the file system 


If CHECKDISK finds anything wrong with the file system, the
System XXV will stop and wait for you to correct the
problems. It cannot come up while something is wrong with
the file system; the risk of destroying files is too great.
For directions on how to correct any problems CHECKDISK
finds, see section 13.2, Correcting Problems Found by
CHECKDISK.


9 STANDALONE RECOVERY
9.1 Introduction


Bringing the system up with the standalone recovery procedure is
useful when some error in the disk, CHECKDISK, or the system
jobs is causing the system to crash before it can come all the
way up. During a standalone recovery, the system does not run
CHECKDISK and the other checking programs that are part of the
three "normal" recovery procedures. Instead, the system comes
up without checking itself and, after it is up, is shut to
normal users; only the person at the operator's terminal is
allowed in. Standalone recovery is risky. Do not bring the
system up with this procedure unless specifically instructed to
do so.


9.2 Summary


1) Check the BUGHLT number on the operator's terminal and look
it up in the list of BUGHLTs; in addition, note which error
lights are lit. Record all this information.


2) Follow the procedure for tape recovery (section 8) from step
13 through step 25. If you suspect there has been a power
failure, do step 3 through 25 of the tape recovery procedure.


3) When the operator's terminal says "EDDT", type "dbugsw/".
The system will print either 0 (zero) or 1.


4) Type "2<CR>".
5) Type "start<ESC>g". The monitor tape should spin.


6) The system will request the date and time. Enter these as
DD-MON-YY<SP>HH:MM and follow with <CR>.


7) You will automatically be logged in as "system", but not
enabled.


9.3 Discussion


To use the standalone recovery procedure, you begin by following
the tape recovery procedure. But after the system reads the
first part of the monitor tape and tells you it is in EDDT, you
do NOT type "start<ESC>g" to start the system running and read
in the rest of the tape. Instead, you work in EDDT, an
interactive language for debugging. EDDT is part of the resident
monitor and is used to patch and otherwise manipulate it.
Because it can change the monitor, EDDT is a very powerful tool:
use it with care.


Once the system tells you that you are in EDDT, do not wait to
be prompted. EDDT has no prompt; as you use it, it simply waits
for you to type something and then reacts. When you see that
you are in EDDT, immediately type "dbugsw/". This command has
two parts. The first part, "dbugsw", is the name of an address.
The "/" means "print". Thus, "dbugsw/" instructs the system to
print what it finds at the location DBUGSW. This location
contains one of the system's recovery switches, the debugging
switch. The number at this address tells the system what it
should do when it encounters a fatal error. A zero (0) at
DBUGSW means the system should respond to errors by crashing. A
1 means the system should take breakpoints, that is, when a
fatal error occurs the system should not crash but should stop
where it is, preserve the context of the error, and print out a
BUGHLT address. This address is what you record after a crash
when you are instructed to record the BUGHLT number. Knowing
the address of the error that caused the system to crash helps
systems programmers find out what happened. Recovery switches
are further explained in section 12, Recovery Switches.


After you print the current contents of DBUGSW, type "2<CR>".
This tells the system to enter 2 at this location. When DBUGSW
is 2, it instructs the system to skip running CHECKDISK and the
system jobs, and to come up "standalone". When a system comes
up standalone, it accepts input only from the operator's
terminal: it does not allow any ordinary users to log in.


Once you have made sure the system will come up isolated from
the outside world, you start it by typing "start<ESC>g". The
system will read the second part of the monitor tape, the part
containing the swappable monitor, and ask you for the date and
time. After you have entered these (as
DD-MON-YY<SP>HH:MM<CR>), the system will come up and
automatically log you in as "system". This automatic login
keeps the system from going through the complicated login
procedure. When you are logged in as "system", you have the
same powers as if you had logged in as "operator"; remember to
enable if you want to do anything requiring special powers.


10 DISK REBUILD STRATEGY OR TOTAL CATASTROPHE


10.1 Introduction


Occasionally, a particularly deadly system crash destroys the
file system. If you suspect this has happened, immediately
notify your manager and, if possible, an OAD operating systems
programmer; do not attempt to do more.


When the file system is destroyed, you must use the various
dumps made each week to rebuild the disk and restore the files
as completely as possible to their pre-crash state. This must
be finished before the system needs anything from the disk.
Rebuilding the disk is a fairly simple procedure, but the loss
of users' files and the possibility that they may be damaged or
incompletely restored is so serious that you should NEVER
undertake a disk rebuild without the specific instructions and
assistance of a manager or an operating systems programmer.


10.2 Summary


WARNING: Never attempt this without specific instructions from
a manager or an operating systems programmer.


1) Check with Tymshare Maintenance to make sure the hardware is
good.

2) Follow the tape recovery procedure from steps 13 through 25.
If there has been a power failure, do steps 1 through 25.

3) When the operator's terminal says "EDDT", type "dbugsw/".
The system will print either zero (0) or one (1).

4) Type "2<CR>".

5) Type "syslod<ESC>g".

6) The system will ask, "Do you really want to clobber the disk
by reinitializing?".

7) Type "y<CR>". This stands for "yes". Do not type more than
y.

8) The system will say, "OK, you asked for it...".

9) The system will reinitialize all the files and then report,
"No EXEC".

10) Load the DLUSER tape on the tape drive.

11) Type "l" for "load". When the system asks, "Load from
magtape MTAn:", type "mta0:<CR>" ("0" here is zero) and
confirm with another <CR>.

12) When the system asks, "File Number?", type "0<CR>".

13) The system will now read the DLUSER file, the first part of
the DLUSER tape.

14) When it has finished, the system will prompt you with a
period (.), the prompt for MINI-EXEC. At the period, type
"s.".


15) The system will print, "Interrupt at nnn", where nnn is some
number, followed by a period.

16) To read the second file on the tape, the DUMPER file, type
"l" for "load". When the system asks, "Load from magtape
MTAn:", type "mta0:<CR>", and confirm with another <CR>.

17) When the system asks, "File Number?", type "1<CR>".

18) The system will now read the DUMPER file.

19) When the system prompts you with a period, type "s.".

20) The program DUMPER is now loaded and ready to start
restoring the files. Mount the first Full Dump Tape. Make
sure you load the Full Dump Tapes in numerical order.

21) DUMPER will now ask a series of questions, preceding each of
them with instructions.

22) To answer the first question, "DUMP, LOAD, CHECK, OR
SINGLE?", type "l" for "load".

23) For the second question, "DO YOU WISH TO SUPERSEDE OLDER
VERSIONS ALWAYS?", type "n" for "no".

24) When DUMPER asks, "SPECIFIC USERS?", type "n".

25) When it asks, "INTO SAME DIRECTORIES?", type "y".

26) Finally, when requested, "TYPE MAG TAPE UNIT NUMBER", type
"0" (zero).

27) DUMPER will now read the tape; when it is finished, it will
print, "MOUNT NEXT TAPE, IF ANY. TYPE C WHEN READY, N IF
NO MORE". Mount the next tape and type "c".

28) DUMPER will again ask for the mag tape unit number; type
"0".

29) Continue mounting and loading Full Dump Tapes, typing "c" to
continue and then giving 0 (zero) for mag tape unit number,
until all the tapes have been read.

30) When you have finished loading the Full Dump Tapes, begin
loading the Incremental Dump Tapes. Be sure that you load
the Incremental Dump Tapes in chronological order, beginning
with the one made right after the Full Dump and ending with
the most recent.

31) After the system has read the last Incremental Dump Tape,
when DUMPER says, "MOUNT NEXT TAPE, IF ANY. TYPE C WHEN
READY, N IF NO MORE", type "n".

32) DUMPER will stop and the system will print an interrupt
message followed by a period, the prompt for MINI-EXEC.

33) The files have now been restored as completely as possible.
Halt the system, and begin a disk recovery. (To halt the
system use Method A of section 13.5, Halting the System.)
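
The DUMPER answers in steps 22 through 26 can be kept together as
a small lookup table, which may be handy as a checklist. The
sketch below is only that: the dictionary and function names are
hypothetical, and the question wording is copied from the
summary above.

```python
# Answers for a full disk rebuild, keyed by the DUMPER question as the
# summary above quotes it. The table and function are illustrative only.
REBUILD_ANSWERS = {
    "DUMP, LOAD, CHECK, OR SINGLE?": "l",
    "DO YOU WISH TO SUPERSEDE OLDER VERSIONS ALWAYS?": "n",
    "SPECIFIC USERS?": "n",
    "INTO SAME DIRECTORIES?": "y",
    "TYPE MAG TAPE UNIT NUMBER": "0",
}


def answer_for(question):
    """Return the rebuild answer for a DUMPER question, or None if unknown."""
    return REBUILD_ANSWERS.get(question.strip().upper())
```

A question that is not in the table returns None, which is a
reasonable cue to stop and ask for help rather than guess.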


10.3 Discussion


A disk rebuild is necessary when a crash destroys the files.
Since a system that has lost its files cannot be expected to run
CHECKDISK or the system jobs successfully -- these are stored on
the disk and everything on the disk has been lost -- the system
must be brought up in such a way that it does not need anything
from its files; in fact, it does not even realize they are lost.
This means the system must be brought up standalone, since in a
standalone recovery the system skips running the system jobs,
does not check the file system with CHECKDISK, and does not
open itself for normal use.


But bringing up the system for a disk rebuild is not a
completely "normal" example of bringing the system up
standalone. After you have set DBUGSW to 2 (to make the system
come up without checking itself and closed to users), you do not
then type "start<ESC>g" to start the system running. Instead
you type "syslod<ESC>g". This stands for "system load". It
tells the system to begin running a program that wipes out all
existing files and then allows you to rebuild the entire file
system with files copied from tape.


When you give the syslod command, the system will ask, "Do you
really want to clobber the file system?". When you respond "y",
for "yes", it will print, "OK, you asked for it..." and
reinitialize all the files. When the files have been
reinitialized, the system will state: "No EXEC". EXEC
disappears because it was stored on the disk. After informing
you of EXEC's disappearance, the system will prompt you with a
period (.), the prompt for MINI-EXEC. MINI-EXEC is a group of
basic commands that are loaded with the monitor. MINI-EXEC has
two important features: First, MINI-EXEC recognizes commands by
their first letter, so you type "l" for "load", "s" for "start",
and so on; and second, in MINI-EXEC you must end all simple
commands that do not ask you for further information with a
period for confirmation.


NOTE: Because MINI-EXEC recognizes commands by their first
letter, if you make a mistake in giving a simple command, type a
few more random characters before you confirm with a period.
The additional letters you type will make the command
unrecognizable. When you are again prompted with a period,
repeat the command you want. If you are giving a command that
asks you for further information before it is executed, type
some random characters in answer to the additional question.
This will cause the command to be aborted and the period will
reappear.
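
The behavior described in the note can be pictured with a toy
model. This is a sketch under assumptions, not MINI-EXEC's real
logic: only "load" and "start" are named in the text, the table
and function names are made up, and the handling of extra
characters after "l" is simplified.

```python
# Toy model of first-letter recognition with period confirmation.
# Each entry is (command name, whether it asks for further information).
COMMANDS = {
    "l": ("load", True),    # asks for a drive and a file number
    "s": ("start", False),  # simple command: needs the confirming period
}


def recognize(typed):
    """Return the command name for what was typed, or None if unrecognized."""
    if not typed:
        return None
    entry = COMMANDS.get(typed[0].lower())
    if entry is None:
        return None
    name, asks_for_more = entry
    if asks_for_more:
        # Recognized on the letter alone; questions follow.
        return name if len(typed) == 1 else None
    # A simple command is accepted only as the letter plus a period.
    return name if typed[1:] == "." else None
```

In this model "s." starts the loaded program, while "sxyz." is
rejected, which is the escape route the note describes.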


Once you are in MINI-EXEC, you can begin the procedure for
rebuilding the disk from backup tapes. Mount the DLUSER tape on
tape drive zero, and type "l", for "load". When the system asks
you, "Load from magtape MTAn:", identify your drive as "mta0:"
and confirm by typing "<CR><CR>". The system will now ask which
file on the tape it should read by printing: "File Number?".
The number of the DLUSER file, which should be printed on the
tape casing, is zero (0). Enter this and follow it with a
carriage return.


DLUSER stands for "dump and load users". The DLUSER file
contains data about all the directories on the system, both the
user and system directories, and a program that can use this
information to rebuild them. When you type "s.", you instruct
the system to run this program.


When the system has rebuilt the directories, it will print an
interrupt message. This means it is ready to read another file.
You now want to load the file containing the DUMPER program. To
do this, again type "l", and then again, when asked, "Load from
magtape MTAn:", identify your drive as "mta0:" and confirm by
typing "<CR><CR>". Next, you will be asked for the file number.
The file number for the DUMPER file is one (1); this also should
be printed on the tape casing. Enter 1 and follow it with a
confirming <CR>. The DUMPER file contains a program able to
read files from tape and restore them to the correct
directories. Once you have loaded DUMPER and typed "s." to
start it, the DUMPER program will start running. You can now
use this program to rebuild the disk by mounting and reading
back into the system the dump tapes that contain the back-up
versions of all the files on the disk.


To ascertain what it should do, DUMPER will ask you a series of
questions. Each question is preceded by an explanation of how
you should answer it. These explanations are designed for
people using DUMPER for routine maintenance of the file system.
Do not be alarmed if the answers you are instructed to give here
do not agree with what the system tells you to do. Disk rebuild
is not a normal situation.


The first thing DUMPER will want to know is what you plan to do.
To find out, DUMPER will ask "DUMP, LOAD, CHECK, OR SINGLE?".
Since you want to load files from tape back into the disk, type
"l", for "load". Then mount the first Full Dump Tape on the
tape drive. Full Dump Tapes are tapes made at regular
intervals, usually weekly, that contain a record of the entire
contents of the disk. The first Full Dump Tape contains all the
information the system needs to run. As you mount this and the
following tapes on the tape drive, make sure they do not have
write rings.


DUMPER now will try to find out what to do with the information
on the tape. It will first ask "DO YOU WANT TO SUPERSEDE OLDER
VERSIONS ALWAYS?". Answer with an "n", for "no". This makes
sure that DUMPER will put the files from tape and the files
already on the disk in the correct order and pay attention to
version numbers. DUMPER will then ask "SPECIFIC USERS?".
DUMPER asks this because it normally restores the files of
single users whose directories are somehow lost or mutilated.
Since you want to restore all the files of every user, type
"n", for "no"; and, when DUMPER wants to know, "INTO SAME
DIRECTORIES?", type "y". DUMPER's final request will be: "TYPE
MAG TAPE UNIT NUMBER". After you type "0", DUMPER will copy the
files from the currently mounted tape into their directories.
When it has finished, it will print, "MOUNT NEXT TAPE, IF ANY.
TYPE C WHEN READY, N IF NO MORE". Load the next full dump
tape, making sure it does not have a write ring, and type "c",
for "continue". When the mag tape unit number is requested,
answer with "0". This new tape will then be read, and DUMPER
will again ask if you want to go on. Continue loading the Full
Dump Tapes until all have been read.


When you have finished loading the Full Dump Tapes, it is time
to load the Incremental Dump Tapes. Incremental Dump Tapes are
tapes made every night that contain only files altered during
the preceding day. As you enter the Incremental Dump Tapes, you
progressively update the files entered from the Full Dump Tapes.
Enter the Incremental Dump Tapes in chronological order,
beginning with the tape made right after the full dump, and
ending with the tape made most recently. Use the same procedure
you used to load the Full Dump Tapes: mount the tape, type "c",
and enter the unit number. When you have loaded the final, i.e.,
the most recent, Incremental Dump Tape, you will have restored
the files as well as they can be restored. At this point,
answer "n", for no, to DUMPER's question about any further
tapes. DUMPER then will halt, and the system will print an
interrupt message followed by a period, the prompt for
MINI-EXEC. Now halt the system (with Method A of section 13.5,
Halting the System) and bring it up with a disk recovery.
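
The loading order described above (Full Dump Tapes in numerical
order, then Incremental Dump Tapes from oldest to newest) can be
sketched as a small sort. The function name and the way tapes
are represented here are assumptions made for the illustration;
real tapes carry whatever labels your installation writes on
them.

```python
from datetime import date


def load_order(full_tapes, incremental_tapes):
    """Return tape labels in the order they should be mounted.

    full_tapes:        list of (tape_number, label) pairs
    incremental_tapes: list of (date_made, label) pairs
    """
    ordered = [label for _, label in
               sorted(full_tapes, key=lambda tape: tape[0])]
    ordered += [label for _, label in
                sorted(incremental_tapes, key=lambda tape: tape[0])]
    return ordered


# Hypothetical labels, listed out of order on purpose.
full = [(2, "FULL-2"), (1, "FULL-1")]
incremental = [(date(1981, 3, 11), "INC-B"), (date(1981, 3, 10), "INC-A")]
print(load_order(full, incremental))
```

Writing the order down before you start makes it harder to mount
an older incremental tape after a newer one, which would undo
part of the update.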


10.4 Errors and Recoveries
Inability to read the DLUSER tape


If you are unable to read the DLUSER tape, halt the system
and start the entire procedure over again.


Inability to read the first Full Dump Tape (the tape after the
DLUSER tape)


Since the DLUSER tape contains a copy of DUMPER, once you
have read this tape, DUMPER is stored on the disk. If you
then cannot read the second tape, that is, the first Full
Dump Tape, type <CTRL-P>. (You may have to do this several
times.) You will get a period, the prompt for MINI-EXEC.
After you have the period, halt the system and bring it up as
documented in the section "Standalone Recovery". When the
system is up, you can run DUMPER from disk by typing
"dumper<CR>" at the EXEC "@". Once DUMPER is running, start
from step 20 in the procedure documented above. If you still
cannot read the first Full Dump Tape, halt the system and
start the whole process again from step 1.


11 RECOVERY FROM MEMORY PARITY ERRORS
11.1 Introduction


System XXVs run on odd parity. This means that for every word
of memory, the number of bits turned on, counting the parity
bit, must be odd. The system checks the parity whenever it uses
stored information. If it finds a word with even parity, a
parity error occurs. If the system discovers a parity error, it
first tries to correct the error itself. If the error cannot be
corrected, then the system automatically scans core, prints an
error message listing the locations and contents of the
offending addresses, and stops with a BUGHLT. Tymshare
Maintenance must be called for all System XXV parity errors, as
they indicate that the memory hardware may be bad.
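
The odd parity rule can be illustrated with a few lines of
code. This is a sketch of the general technique only; it treats
a memory word as a non-negative integer, assumes nothing about
the System XXV's word length or hardware, and uses function
names invented for the example.

```python
def parity_bit_for(word):
    """Return the parity bit (0 or 1) that gives `word` odd parity."""
    ones = bin(word).count("1")
    return 0 if ones % 2 == 1 else 1


def has_odd_parity(word, parity_bit):
    """True when the count of 1 bits in the word plus the parity bit is odd."""
    return (bin(word).count("1") + parity_bit) % 2 == 1


stored = 0b1011                      # three bits on
bit = parity_bit_for(stored)         # so the parity bit must be 0
assert has_odd_parity(stored, bit)

damaged = stored ^ 0b0001            # one bit flipped by a memory fault
assert not has_odd_parity(damaged, bit)   # this is a parity error
```

A single flipped bit always changes the count from odd to even,
which is why the check catches it; two flipped bits in the same
word would go unnoticed by a check of this kind.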


11.2 Summary


Call Tymshare Maintenance for all parity errors.


12 RECOVERY SWITCHES
12.1 Introduction


The System XXV has four "recovery switches" that tell it how to
respond to system errors and what to do when it crashes. These
switches are actually four locations in the system's central
memory, each controlling a particular aspect of the system's
response. Every location or switch can have at least two
different values. Changing the value changes what the system
will do in the particular situation that the switch controls.
For example, the switch controlling what the system does when it
encounters a BUGCHK, a less serious error than a BUGHLT, can be
set to 0 (zero) or 1 (one). When the switch has the value 0,
the system ignores BUGCHKs; when it is set to 1, the system
crashes when it encounters a BUGCHK. Thus, to make the system
run as you wish, you simply set each switch to the appropriate
value. The rest of this section will help you discover what
this value may be and teach you how to set it. The first part,
"Switches and Their Settings", discusses each of the four
switches and what they control, and describes the effects of
their different settings. The second part, "How to Change
Switch Settings", explains how to set a switch to have the value
you want.


12.2 Switches and Their Settings
The System XXV's four recovery switches are:
DBUGSW, which controls the response to a BUGHLT
DCHKSW, which controls the response to a BUGCHK


CDMPSW, which tells the system whether or not to take a core
dump


RELDSW, which tells the system whether or not to actually
begin automatic recovery


The system checks DCHKSW when it encounters a BUGCHK, a
relatively minor type of error. The system then immediately
does as this switch instructs it; no other switches are looked
at. When the system encounters a BUGHLT, a fatal error, it
checks DBUGSW. DBUGSW may then tell it to check the two
remaining switches, CDMPSW and RELDSW. If DBUGSW does not
instruct the system to look at CDMPSW and RELDSW, they are never
checked.


The table below outlines the values each recovery switch can
have and the effect of setting the switch to this value. The
first column gives the name of the switch, the second column
lists the possible values for this switch, and the third column
describes how the system will act when the switch has this
value. The "normal" value for each switch is marked with a
star (*). When all switches have their normal values, the
system prints messages at BUGCHKs, and crashes at BUGHLTs.
After the system crashes, it will wait for an operator to bring
it up. If you want the system to recover automatically, rather
than wait for the operator's instructions, simply change the
setting of DBUGSW from 1 to 0. See the next section for
instructions.


Switch    Value        Effects
          (*=normal)
-----------------------------------------------------------------

DBUGSW    0     Stop at BUGHLTs, check CDMPSW and RELDSW and
                do what they say. (For Automatic Recovery)

          1*    Stop at BUGHLTs, don't check CDMPSW and RELDSW,
                go into EDDT, and wait for an operator to
                begin recovery. (For Disk or Tape Recovery)

          2     Stop at BUGHLTs, don't check CDMPSW and RELDSW,
                go into EDDT. Don't run system-checking programs
                and come up shut. (For Standalone Recovery)

DCHKSW    0*    Don't stop at BUGCHKs, print error message and
                continue.

          1     Stop at BUGCHKs, print error message, go into
                EDDT, and wait for an operator to begin recovery.

CDMPSW    0     Don't take core dump before beginning recovery.

          1*    Take core dump before beginning recovery.

RELDSW    0     Don't begin automatic recovery after crashing.

          1*    Begin automatic recovery after crashing.


12.3 Switch Descriptions


DBUGSW: DBUGSW, located at memory location 76, is the switch
the system checks when it encounters a BUGHLT while running. A
BUGHLT is a serious system error. The system must crash when a
BUGHLT occurs; this switch determines what the system does after
the crash. DBUGSW can be set to 0, 1, or 2. A 0 at DBUGSW is
the setting for automatic recovery. It instructs the system to
check the switches CDMPSW and RELDSW and do as they say. CDMPSW
will tell it whether a core dump should be taken; RELDSW will
tell the system whether to actually start the recovery. (See
below and the section on automatic recovery for details.) A 1
at DBUGSW is the standard setting. With this setting, upon
encountering a BUGHLT, the system stops where it is, prints out
a BUGHLT message, goes into EDDT, and waits for instructions
from the operator's terminal. You can then begin whatever
recovery procedure is appropriate. A 2 at DBUGSW has the same
effect as a 1, and, in addition, after a recovery procedure is
started, causes the system to come up standalone. The system
will come up without running CHECKDISK and the system jobs and,
after it is up, only the person at the operator's terminal is
allowed access. (To do a standalone recovery, you put a 2 in
DBUGSW before the system comes all the way up.)


DCHKSW: DCHKSW, located at memory address 77, is the only
switch checked when the system encounters a BUGCHK; no other
switches are consulted. DCHKSW can be set to 0 or 1. When
DCHKSW is 0, the system will print out a BUGCHK message and then
continue running. When DCHKSW is 1, the system will print out a
BUGCHK message and then stop, go into EDDT, and wait for further
instructions about how to come up. Since BUGCHKs are not
serious errors, the usual setting for DCHKSW is 0.


CDMPSW: CDMPSW is located at memory address 100. It is checked
only when a BUGHLT occurs and the system finds that DBUGSW is
set to zero, the setting for automatic recovery. CDMPSW tells
the system whether or not to make a copy of the contents of
central memory before beginning to come up. The process of
copying the contents of memory is called "taking a core dump".
If the CDMPSW switch is set to 0, then the system will not take
a core dump. It will simply check RELDSW to see if it really
should come up automatically. If CDMPSW is set to 1, before
checking RELDSW, the system will copy the first 512 pages of
memory into a file called <SYSTEM>CORDMP.LOW and the second 512
pages into a file called <SYSTEM>CORDMP.HGH. Since systems
programmers may need to look at the contents of the memory to
investigate the crash, CDMPSW is generally set to 1.


RELDSW: RELDSW is located at memory address 101. It, like
CDMPSW, is checked only after a BUGHLT occurs and DBUGSW is set
to zero, the setting for automatic recovery. RELDSW tells the
system whether or not it should actually begin this automatic
recovery. A 0 in RELDSW tells the system not to recover
automatically; the system will then wait for instructions from
the operator's terminal just as if DBUGSW were set to 1. A 1
tells the system "yes, do begin to come up automatically".
Because when DBUGSW is 0 you usually do want the system to
recover automatically, the normal setting of RELDSW is 1.


12.4 How to Change Switch Settings
Introduction 


This section tells you how to change the values of the System
XXV's four recovery switches. To do this the system must be
running correctly and you must be able to enable. To learn
the names of the recovery switches, their values, and what
they mean, see the previous section.


Summary


1) Check your prompt. If it is an exclamation mark (!), you
are enabled. If it is not, type "ena<CR>" at the EXEC
"@".


2) At the "!" prompt, type "mddt<CR>".


3) Type "switchname/", where switchname stands for the name
of the switch you want to change.


4) The system will give you the current value of the switch.


5) Type "switchvalue<CR>", where switchvalue stands for the
new value the switch should have: 0, 1, or 2.


6) To check the new value, type "switchname/" again. The
system should show you the new value.


7) Type "<CTRL-C>"; you should return to EXEC and get the "!"
prompt.


Discussion 


To change the values of the recovery switches, you need to
work in MDDT, an interactive language for debugging. It is
part of the resident monitor and is used to change and
manipulate it. To enter MDDT, you first need to make sure
you are enabled. Check your prompt; if it is an exclamation
mark (!), you are enabled. If it is anything else, type
"ena<CR>" at the EXEC "@" prompt. After you are sure you are
enabled, enter MDDT by typing "mddt<CR>". The system will
print "mddt", to show you have entered, and then do nothing
more. Like EDDT, MDDT has no herald; as you use it, it
simply waits for you to tell it something and then reacts.


When you are in MDDT, to go to the switch you want to change
and look at its current value, type the switch name followed
by a slash (/), for example, "dchksw/". This command has two
parts. The first part, "dchksw", is the name of the address
that contains the recovery switch value. The "/" means
"print". Thus, "dchksw/" instructs the system to show you
the contents of the location "dchksw".


Once MDDT has shown you the value of the switch, it waits at
this location to see if you want to do anything else. If you
decide you do not want to change this switch, simply type a
carriage return. To enter a different value in this address,
type the value you want followed by a carriage return. The
number you type will immediately become the new value of the
switch. To make sure that you entered the value you wanted,
again type the switch name followed by a slash. If the value
is correct, simply type a carriage return. This means you
are finished working with this address. If it is not
correct, type the correct value and then a carriage return.


After you have changed as many of the four recovery switches
as you wish, you are ready to leave MDDT. To do this, type
<CTRL-C> and you will be returned to EXEC.
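
The examine-then-change pattern described above can be modeled
in a few lines. This is a toy, not MDDT: the class and method
names are hypothetical, each call stands for one thing you type,
and a bare carriage return is represented by an empty string.

```python
class SwitchMemory:
    """Toy model of examining and changing a named location."""

    def __init__(self, values):
        self.values = dict(values)
        self.open_name = None  # location opened by the last "name/"

    def type_line(self, text):
        """Handle "name/" (examine) or a value ended by a carriage return."""
        if text.endswith("/"):
            self.open_name = text[:-1].upper()
            return self.values[self.open_name]
        if text == "":
            self.open_name = None          # bare carriage return: no change
            return None
        if self.open_name is None:
            raise ValueError("no location is open")
        self.values[self.open_name] = int(text)
        self.open_name = None
        return None


memory = SwitchMemory({"DBUGSW": 1, "DCHKSW": 0})
print(memory.type_line("dbugsw/"))   # shows the current value, 1
memory.type_line("0")                # enter the new value
print(memory.type_line("dbugsw/"))   # check it: now 0
memory.type_line("")                 # correct, so just a carriage return
```

The second "dbugsw/" is the verification step from the summary;
it costs one line of typing and confirms the change before you
leave with <CTRL-C>.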


13 RELATED PROCEDURES
13.1 Introduction
This section describes the following procedures:
Correcting Problems Found by CHECKDISK, section 13.2
Running CHECKDISK Yourself from EXEC, section 13.3
Deleting and Expunging Files, section 13.4
Halting the System, section 13.5
Changing the Date and Time, section 13.6
Connecting to and Disconnecting from the Micronode TYMBASE,
section 13.7


When a recovery process requires that one of these procedures be
used, you will be referred here. If you find that you never
have to use any of them, do not be alarmed. This is a sign of
success. These procedures are used only when something goes
wrong: when, for example, CHECKDISK finds file problems that
must be corrected, you need to halt the system, or the system
somehow comes up with the wrong date and time.


13.2 Correcting Problems Found by CHECKDISK
Introduction


CHECKDISK is a program the system uses to check the file
system before it comes all the way up and opens itself to
users. If it finds any problems, the system states "August
not in operation" and stops to wait for them to be corrected
with the procedure documented below. Once this is done, halt
the system as documented later in "Related Procedures", and
then bring it up again with disk recovery. This section
deals only with recovery from file errors detected by
CHECKDISK. It assumes that CHECKDISK has been run
automatically. To learn how to run CHECKDISK manually, see
13.3, Running CHECKDISK Yourself from EXEC.


Summary 


CHECKDISK checks the files for Illegal Disk Addresses (IDAs),
Multiple Disk Addresses (MDAs), and Bit Table Errors (BTEs).
If it finds any of these, it lists the files involved and
their errors. To correct the problems found by CHECKDISK, do
the following:


1) If only one file has errors, delete and expunge that file.
Be sure to type the entire file name, including all
extensions; do not use <ESC> to fill out names. The
process of deleting and expunging files is described in
section 13.4, Deleting and Expunging Files.


2) If more than one file is involved, delete and expunge all
files with IDAs. Do not delete the files with MDAs at
this point.


3) Halt the system with Method A of section 13.5, Halting the
System. Then bring it up again with disk recovery. If
CHECKDISK again finds files with IDAs, repeat this
procedure.


4) Once no files have IDAs, if one or more files have MDAs,
delete and expunge the file with the largest number of
MDAs, halt the system, and bring it up with disk recovery.
Do this three times. If you then still have more than 20
files with MDAs, call an operating systems programmer.


NOTE: Keep a list of the files you delete and expunge, and
restore them after the system comes up. Always send
messages to all users whose files have been deleted and
restored.


Discussion 


CHECKDISK can detect three types of errors: Bit Table Errors
(BTEs), Illegal Disk Addresses (IDAs), and Multiple Disk
Addresses (MDAs). CHECKDISK can correct BTEs without
assistance. It cannot, however, correct IDAs or MDAs. These
two errors are what are known as Page Table Errors. They
occur when the system's file map, stored in what is called a
"page table", is incorrect. AUGUST memory is divided into
units called pages, each consisting of 512 words. File
storage is allotted by pages, and one page is the smallest
unit of storage that can be transferred from disk to core. A
page table is like a table of contents for the disk storage.
For each file, it records the addresses of all the pages
allocated to that file. An IDA means there is a disk
address that is garbage. An MDA means the system has
assigned the same part of the disk to two or more files. If
these errors are allowed to go uncorrected, they can destroy
the file system.
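
The two page table errors can be illustrated with a toy model.
The Python sketch below is not how AUGUST or CHECKDISK is
implemented; the dictionary layout, the function names, and the
idea that valid addresses run from 0 to disk_pages - 1 are all
assumptions made for illustration. Only the 512-word page size
comes from the text.

```python
import math
from collections import defaultdict

PAGE_WORDS = 512  # from the text: one page is 512 words


def pages_needed(file_words):
    """Whole pages needed to hold a file of file_words words."""
    return math.ceil(file_words / PAGE_WORDS)


def check_page_table(page_table, disk_pages):
    """Count IDAs and MDAs per file in a toy page table.

    page_table: dict of file name -> list of disk page addresses.
    disk_pages: size of the hypothetical disk, in pages.
    Returns (idas, mdas), each a dict of file name -> error count.
    """
    owners = defaultdict(set)
    idas = defaultdict(int)
    for name, addresses in page_table.items():
        for addr in addresses:
            if 0 <= addr < disk_pages:
                owners[addr].add(name)
            else:
                idas[name] += 1      # address is garbage: an IDA
    mdas = defaultdict(int)
    for names in owners.values():
        if len(names) > 1:           # page claimed by several files: an MDA
            for name in names:
                mdas[name] += 1
    return dict(idas), dict(mdas)
```

For example, with a 10-page disk and the table
{"a": [1, 2, 3], "b": [3, 4], "c": [99, 5]}, file "c" has one
IDA (address 99) and files "a" and "b" each have one MDA
(page 3).
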


The remedy for problems detected by CHECKDISK is to delete
the files that really do have bad storage addresses. If
there is only one file with bad addresses, there is not a
serious problem; simply delete that file. If more than one
file is afflicted, begin by deleting all files with IDAs.
IDAs are a frequent cause of MDAs. Often, when the system
follows an IDA, it will find other things that it can
interpret as more addresses, but which are not. These phony
addresses may duplicate the real addresses of pages belonging
to other files, thus causing MDAs. After IDAs are taken care
of, files with MDAs may remain. Deleting the single file
with the most MDAs may take care of the problem.

Once you have deleted the appropriate files, you should halt
the system and bring it back up with disk recovery. If
CHECKDISK again finds errors, you must again correct them,
bring the system down, and then back up. If the fourth time
CHECKDISK is run it still finds errors, notify an OAD systems
programmer. Remember that once the system does come up
successfully, the owners of the files must be notified about
all files deleted.
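
The order of deletion described in this section can be written
as a small decision function. This is a sketch in Python under
stated assumptions: the (ida_count, mda_count) tuple layout and
the name files_to_delete are invented for the example, and
CHECKDISK's real report format is not shown here. The function
only chooses what to delete on one pass; halting and disk
recovery between passes are still done by hand.

```python
def files_to_delete(errors):
    """Choose the files to delete and expunge on this pass.

    errors: dict of file name -> (ida_count, mda_count), a
            hypothetical summary of the CHECKDISK listing.
    """
    bad = {name: e for name, e in errors.items() if e[0] or e[1]}
    if len(bad) <= 1:
        return sorted(bad)                 # one bad file: just delete it
    with_idas = sorted(n for n, (ida, _) in bad.items() if ida)
    if with_idas:
        return with_idas                   # IDAs first; leave MDA-only files
    # No IDAs left: take the single file with the most MDAs.
    return [max(sorted(bad), key=lambda n: bad[n][1])]
```

With {"a": (1, 0), "b": (0, 3), "c": (2, 1)} the first pass
deletes "a" and "c" and leaves "b" alone; once no IDAs remain,
{"a": (0, 1), "b": (0, 3)} selects only "b".
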


13.3 Running CHECKDISK Yourself from EXEC


Introduction


CHECKDISK is a program that checks the address system and
page allocation of the disk. CHECKDISK usually runs
automatically as the system comes up. However, there may be
occasions, for example after a standalone recovery, when you
need to run CHECKDISK yourself. This section documents that
procedure. What CHECKDISK does is explained in section 13.2,
Correcting Problems Found by CHECKDISK.


Summary 


1) At the EXEC "@", type "<system>checkdisk<ESC><CR>".


2) When CHECKDISK asks, "Do you want to run in multiple fork
mode?", type "Y", for yes. All answers to CHECKDISK's
questions must be capitalized. Do not type more than a
single letter, since CHECKDISK will take any excess
letters as answers to following questions.


3) When CHECKDISK asks, "Do you want to run backwards?", type
"N", for no.


4) To the question, "Rebuild the bit table?", type "N".

5) To the question, "Scan for disk addresses?", type "N".

6) CHECKDISK will now check the disk for bad files. For
instructions on how to deal with bad files, see section
13.2.


Discussion


You invoke CHECKDISK by typing "<system>checkdisk<ESC><CR>".
Once CHECKDISK is loaded, it will ask you a series of
questions to determine how the disk should be checked and how
much information about its status you want to get and store.
When CHECKDISK runs automatically, these options are already
specified; however, when you run CHECKDISK manually, you must
specify them yourself.


The first question CHECKDISK will ask is, "Do you want to run
in multiple fork mode?". This means, "Do you want to fire up
a different fork of EXEC to run CHECKDISK separately for each
disk?". The standard answer here is "Y", for "yes", since
running CHECKDISK simultaneously on all the disks is faster
than going through the disks one at a time. Note that you
should type only the first letter of your answers to
CHECKDISK's questions and that this letter must be
capitalized. This is important. CHECKDISK cannot recognize
lowercase letters. Moreover, if you type more than one
letter, CHECKDISK will read the second and following letters
as answers to later questions. This can cause a lot of
problems.
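
Why extra letters cause trouble can be seen in a small model.
The Python sketch below only imitates the behaviour described
in the text (each typed character is taken as the answer to the
next question, and only capital "Y" and "N" are understood); it
is an assumption-laden illustration, not CHECKDISK's actual
input handling, and the function names are invented.

```python
QUESTIONS = [
    "Do you want to run in multiple fork mode?",
    "Do you want to run backwards?",
    "Rebuild the bit table?",
    "Scan for disk addresses?",
]


def answer_questions(typed):
    """Pair each question, in order, with the next character typed."""
    return list(zip(QUESTIONS, typed))


def rejected(typed):
    """Answers the model cannot use: anything other than "Y" or "N"."""
    return [(q, ch) for q, ch in answer_questions(typed)
            if ch not in ("Y", "N")]


if __name__ == "__main__":
    print(rejected("YNNN"))   # the standard answers: nothing rejected
    print(rejected("YES"))    # "E" and "S" spill into later questions
```

Typing "YES" answers the first question correctly but feeds
"E" to the question about running backwards and "S" to the
question about the bit table.
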


Once CHECKDISK knows how many forks you want, it will ask if
you want it to run backwards and check the disk from the last
file to the first. The standard answer here is "N", for
"no". CHECKDISK will then ask, "Rebuild the bit table?".
Again, answer "N". The bit table is used to keep track of
which pages on the disk have been used and which are free.
However, the bit table is not updated after every process
that frees pages on the disk. When you delete a bad file,
for instance, the bit table will still mark as taken the
pages that you have freed. Thus, it is a good idea to
rebuild the bit table occasionally; otherwise, the whole disk
could eventually be marked as taken when parts of it were
actually free. But rebuilding the bit table is too time
consuming a process to do when you are bringing the system up
from a presumably unscheduled crash.
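
The stale bit table described above can be pictured with a toy
model in Python. This is a sketch only: the list-of-booleans
bit table, the dictionary page table, and both function names
are assumptions for illustration and say nothing about how
AUGUST stores these structures on disk.

```python
def rebuild_bit_table(page_table, disk_pages):
    """Build a fresh bit table from the page table.

    True means the page is in use. page_table maps a file name
    to the list of page addresses allocated to that file.
    """
    bits = [False] * disk_pages
    for addresses in page_table.values():
        for addr in addresses:
            if 0 <= addr < disk_pages:
                bits[addr] = True
    return bits


def free_pages(bits):
    """Addresses the bit table considers free."""
    return [addr for addr, used in enumerate(bits) if not used]


if __name__ == "__main__":
    table = {"good": [0, 1], "bad": [2, 3]}
    stale = rebuild_bit_table(table, 6)   # bit table before the deletion
    del table["bad"]                      # delete the bad file
    print(free_pages(stale))              # pages 2 and 3 still look taken
    print(free_pages(rebuild_bit_table(table, 6)))
```

Until the table is rebuilt, the pages of the deleted file stay
marked as taken, which is the slow leak of free space the text
warns about.
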


CHECKDISK will then ask if it should scan for disk
addresses. CHECKDISK wants to know if you want the names of
the files that are actually associated with all the bad disk
addresses. Since this information is useful only to systems
programmers, answer "N".


CHECKDISK will now check the disk and print a list of bad
files and their errors. For instructions on how to deal with
bad files, see section 13.2, Correcting Problems Found by
CHECKDISK.


13.4 Deleting and Expunging Files


Summary 


1) If your prompt is not an "!", type "ena<CR>" at the EXEC
"@".


2) Connect to the directory that contains the file by typing
"cd<SP>directoryname<CR>", where directoryname stands for
the name of the directory you need.


3) Type "del<SP>filename<CR>". Make sure you type the entire
file name including extensions. Do not use <ESC> to fill
out the name; the file may not be recognized correctly.
Precede all unusual characters in the file name, for
example @, with <CTRL-V>.

4) If the system tells you the file is perpetual, type
"not<SP>perp<SP>filename" and then delete the file.


5) Type "exp<ESC><CR>".


6) Remember to connect back to directory "oper" when you
finish deleting files by typing "cd<SP>oper<CR>".


NOTE: Always send a message to any user whose files you have
deleted.
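
The keystrokes for a deletion can be assembled mechanically,
including the <CTRL-V> in front of unusual characters. The
Python sketch below is illustrative only. The manual does not
define which characters count as unusual, so the ORDINARY set
(letters, digits, period, semicolon, hyphen) is purely an
assumption, as are the function names; the enabling step and
the handling of perpetual files are left out.

```python
import string

# Assumption for this sketch: these characters never need <CTRL-V>.
ORDINARY = set(string.ascii_letters + string.digits + ".;-")


def quote_filename(name):
    """Put <CTRL-V> in front of every character outside ORDINARY."""
    return "".join(ch if ch in ORDINARY else "<CTRL-V>" + ch
                   for ch in name)


def delete_keystrokes(directory, filenames):
    """Keystrokes for deleting and expunging files in one directory."""
    keys = [f"cd<SP>{directory}<CR>"]
    for name in filenames:
        keys.append(f"del<SP>{quote_filename(name)}<CR>")
    keys.append("exp<ESC><CR>")       # expunge the deleted files
    keys.append("cd<SP>oper<CR>")     # connect back to directory oper
    return keys
```

The file names passed in must be complete, with all
extensions, because the procedure forbids using <ESC> to fill
out a name.
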


13.5 Halting the System


Introduction


There are two ways of halting the System XXV; both halt the
system immediately and for no designated length of time.
They are used when you have encountered some problem during a
crash recovery and want to bring the system down so that you
can start again in the normal way. Method A is designed to
halt a system that is running and will respond to commands
given from the operator's terminal. This is probably the
procedure you will most often use. You would use Method A,
for example, to halt the system after fixing file problems
found by CHECKDISK. Whenever Method A does not work because
the system is hung or for some reason does not respond to the
operator's terminal, you should resort to Method B. After
halting the system with either of these methods, you may use
whatever recovery procedure seems appropriate to bring it back
up.


Method A. Halting a Running System from the Operator's Terminal


This procedure has two parts. First, you need to get into
MINI-EXEC, and then you need to halt the system. If you are
already in MINI-EXEC when you decide to halt the system,
start this procedure on step 4; if you are not, start at step
1. The way to tell if you are in MINI-EXEC is to look at the
prompt. If it is a period (.), you are in MINI-EXEC; if it
is anything else, you are not.


1) If your prompt is not an "!", type "ena<CR>" at the
EXEC "@".


2) Type "quit<CR>".


3) When the system asks, "Do you really want to go into
AUGUST monitor? (Confirm)", type "<CR>".


4) At the period (.) prompt, type "h".


5) The system will echo, "HALT TENEX".

6) Type "e".


7) The system will halt.


Method B. Halting the System from the Control Panel


1) Put address switch 31 on (up).

2) Put data switch 2 on.

3) Put CONSOLE DEPOSIT THIS on.

4) Put data switch 2 off.

5) Put data switch 0 on.

6) Put CONSOLE DEPOSIT THIS on.

7) When activity (the flickering of the lights, etc.) stops,
the system has halted.


13.6 Changing the Date and Time
Summary
1) If your prompt is not an "!", type "ena<CR>" at the EXEC
"@".
2) When you see the prompt "!", type
"<CTRL-E>set<SP>DD-MON-YY<SP>HH:MM<CR>"; that is, two
numbers for the day, a dash, the first three letters of
the month, a dash, and then two numbers for the year.
Follow this with a space, then give the time on a 24 hour
basis, and end with <CR>. You must type the entire date
and time to reset any part of it.
3) Type a confirming <CR>.
4) At the EXEC "!", type "day<CR>" to check the new date and
time.
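
The date and time format in step 2 is easy to get wrong by
hand, so here is a sketch in Python that formats a given moment
into it. The function name set_command is hypothetical, and two
details are assumptions rather than facts from the manual: that
the month is typed in capitals, and that hours and minutes are
separated by a colon.

```python
from datetime import datetime

# Spelled out so the result does not depend on the locale.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def set_command(when):
    """Keystrokes for setting the system date and time to `when`."""
    date_part = (f"{when.day:02d}-{MONTHS[when.month - 1]}-"
                 f"{when.year % 100:02d}")
    time_part = f"{when.hour:02d}:{when.minute:02d}"   # 24 hour basis
    return f"<CTRL-E>set<SP>{date_part}<SP>{time_part}<CR>"


if __name__ == "__main__":
    print(set_command(datetime(1981, 3, 7, 14, 5)))
```

Both parts are always produced together, which matches the rule
that the entire date and time must be typed to reset any part
of it.
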
13.7 Connecting to and Disconnecting from the Micronode TYMBASE


Introduction 


The two sets of procedures documented below allow you to control
whether or not the system will communicate with the micronode
TYMBASE. The ability to control the system's interaction
with the micronode is useful when some micronode error is
causing system problems or when the micronode is down and
the system should not try to connect to it. Each set of
procedures allows you to do the same things: turn the
micronode connection off, which causes the system to
ignore the micronode, and turn the micronode connection
on, which tells the system to synchronize with the
micronode. Method A and Method B differ in where you give
the controlling commands. Method A uses commands given in
EXEC. In Method B, on the other hand, you use EDDT.
Method B should be used only during crash recovery, when
you must control how the system interacts with the
micronode as it comes up. In all other cases, control
interaction from the EXEC with Method A.


Method A: Controlling interaction from EXEC


To turn off the micronode connection


1) At the EXEC "@", type "<CTRL-E>tymnet<SP>off<CR>".


To turn on the micronode connection


1) At the EXEC "@", type "<CTRL-E>tymnet<SP>on<CR>".


Method B: Controlling interaction from EDDT


To turn off the micronode connection


1) In EDDT, type "tymflg/".


2) After the system prints a value, type "0<CR>".


To turn on the micronode connection


1) In EDDT, type "tymflg/".


2) After the system prints a value, type "-1<CR>".

APPENDIX

This section is designed for quick reference; use it when you need
to look up a certain step in a procedure, cannot remember
exactly what order to do things, and so forth. No explanations
of when to use these procedures, discussions of what they do, or
suggestions about what to do if things go wrong are included
here. For this type of information go to the first part of this
manual where the procedures outlined in this section are
discussed in greater length. All section references in this
appendix are also to earlier sections in this document.

WHAT TO DO IF THE SYSTEM IS HUNG 
1) Put address switch 31 on (up).
2) Put data switch 2 on.
3) Put CONSOLE DEPOSIT THIS momentarily on.
4) Put data switch 0 on.

5) Put CONSOLE DEPOSIT THIS momentarily on.

6) Wait until activity (the flickering of the lights, etc.)
stops.

7) Bring the system up with the disk recovery procedure.


DISK RECOVERY 
1) Record the BUGHLT number and error lights.

2) Type "dskrld<ESC>g". The response should be "reloading from
disk". If you never get this message, begin a tape recovery.

3) When the system says, "BOOT FROM DISK PACK # [CR FOR ANY]",
type <CR>.

4) When the operator's terminal says "EDDT", type "start<ESC>g".

5) If CHECKDISK runs successfully, the system will announce
"August in operation" and ask for the date and time. Enter
these in the form DD-MON-YY<SP>HH:MM and follow with <CR>.

6) At the @ prompt, log in by typing "oper<SP>password<SP><CR>",
where password stands for your password.

7) Type "ena<CR>".

8) Type "ref<SP>a<CR>".


AUTOMATIC RECOVERY
If a System XXV set for automatic recovery comes up
successfully, you do not need to do anything until you log in as
an operator.

1) At the @ prompt, log in by typing "oper<SP>password<SP><CR>",
where password stands for your password.

2) Type "ena<CR>".

3) Type "ref<SP>a<CR>".


TAPE RECOVERY 

1) Record the BUGHLT number and error lights.

2) If the power has gone off, you must reload the microcode as
documented in steps 3 through 12. If the power has not gone
off, skip to step 13.

3) Mount the microcode tape on the tape drive.

4) Put all switches on the control panel off (down).

5) Put address switch 32 on (up).

6) Put MICRO PROCESSOR STOP on.

7) Put MICRO PROCESSOR MIPC on.

8) Put MICRO PROCESSOR CLR momentarily on.

9) Put MICRO PROCESSOR CONT momentarily on.

10) Put MICRO PROCESSOR MIPC off.

11) Put MICRO PROCESSOR STOP off.

12) Put MICRO PROCESSOR CONT momentarily on. The tape should
spin and then stop. Remove the microcode tape from the tape
drive.

13) You are now ready to read in the new monitor. Mount the
monitor tape on the tape drive. (Start here if you do not
want to load the microcode.)

14) Put all switches on the control panel off.

15) Put address switches 24 and 26 on.

16) Put MICRO PROCESSOR STOP on.

17) Put MICRO PROCESSOR MIPC on.

18) Put MICRO PROCESSOR CLR momentarily on.

19) Put MICRO PROCESSOR CONT momentarily on.

20) Put MICRO PROCESSOR MIPC off.

21) Put MICRO PROCESSOR STOP off.

22) Momentarily put MICRO PROCESSOR CONT on. The tape should
spin and then stop.

23) Put address switches 24 and 26 off.

24) Put address switches 29 and 30 on.

25) Momentarily put CONSOLE START on twice.

26) When the operator's terminal says "EDDT", type
"start<ESC>g".

27) Remove the monitor tape from the tape drive.

28) After the system reports the size of the memory, put MI PAR
ERR STOP and MEM PAR ERR STOP on.

29) If CHECKDISK runs successfully, the system will announce
"August in operation" and ask for the date and time. Enter
these in the form DD-MON-YY<SP>HH:MM and follow with <CR>.

30) At the @ prompt, log in by typing
"oper<SP>password<SP><CR>", where password stands for your
password.

31) Type "ena<CR>".

32) Type "ref<SP>a<CR>".

STANDALONE RECOVERY

1) Record the BUGHLT number and error lights.

2) Follow the procedure for tape recovery (section 8) from step
13 through step 25. If you suspect there has been a power
failure, do steps 3 through 25 of the tape recovery procedure.

3) When the operator's terminal says "EDDT", type "dbugsw/".

4) Type "2<CR>".

5) Type "start<ESC>g". The monitor tape should spin.

6) The system will request the date and time. Enter these in
the form DD-MON-YY<SP>HH:MM and follow with <CR>.

7) You will automatically be logged in as "system", but not
enabled.


DISK REBUILD STRATEGY OR TOTAL CATASTROPHE

WARNING: Never attempt this without specific instructions from
a manager or an operating systems programmer.

1) Check with Tymshare Maintenance to make sure the hardware is
good.

2) Follow the tape recovery procedure from steps 13 through 25.
If there has been a power failure, do steps 3 through 25.

3) When the operator's terminal says "EDDT", type "dbugsw/".

4) Type "2<CR>".

5) Type "syslod<ESC>g".

6) When the system asks, "Do you really want to clobber the disk
by reinitializing?", type "y<CR>".

7) Load the DLUSER tape on the tape drive.

8) Type "l", for "load". When the system asks, "Load from
magtape MTAn:", type "mta0:<CR>" ("0" here is zero). Confirm
this with another <CR>.

9) When the system asks, "File Number?", type "0<CR>".

10) When the system has read the DLUSER file and prompts you
with a period (.), type "s".

11) When you see "Interrupt at nnn", where nnn is some number,
type "l", for "load", then specify "mta0:<CR>", and confirm
with another <CR>.

12) When the system asks, "File Number?", type "1<CR>".

13) At the period prompt, type "s".

14) Mount the first Full Dump Tape. Make sure you load the Full
Dump Tapes in numerical order.

15) To DUMPER's question, "DUMP, LOAD, CHECK, OR SINGLE?", type
"l", for "load".

16) For the second question, "DO YOU WISH TO SUPERSEDE OLDER
VERSIONS ALWAYS?", type "n".

17) For the question, "SPECIFIC USERS?", type "n".

18) To answer "INTO SAME DIRECTORIES?", type "y".

19) When requested, "TYPE MAG TAPE UNIT NUMBER", type "0"
(zero).

20) DUMPER will now read the tape; when it is finished, it will
print, "MOUNT NEXT TAPE, IF ANY. TYPE C WHEN READY, N IF
NO MORE". Mount the next tape and type "c".

21) When DUMPER asks for the mag tape unit number, type "0".

22) Continue mounting and loading Full Dump Tapes, typing "c" to
continue and then giving 0 (zero) for mag tape unit number,
until all the tapes have been read.

23) When you have finished loading the Full Dump Tapes, begin
loading the Incremental Dump Tapes. Be sure that you load
the Incremental Dump Tapes in chronological order, beginning
with the one made right after the Full Dump and ending with
the most recent.

24) After the system has read the last Incremental Dump Tape,
when DUMPER says, "MOUNT NEXT TAPE, IF ANY. TYPE C WHEN
READY, N IF NO MORE", type "n".

25) When you see an interrupt message followed by a period, halt
the system, and begin a disk recovery. (To halt the system
use Method A of section 13.5, Halting the System.)
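
The tape ordering required in steps 14 and 23 can be checked
before starting with a short sketch like the one below. It is
illustrative Python only; the function name load_order and the
way tapes are identified (a number for each Full Dump Tape, a
date for each Incremental Dump Tape) are assumptions made for
the example, not labels the manual prescribes.

```python
from datetime import date


def load_order(full_tape_numbers, incremental_dates):
    """Return the order in which the dump tapes should be mounted.

    Full Dump Tapes come first, in numerical order, followed by
    the Incremental Dump Tapes, oldest first.
    """
    order = [("full", n) for n in sorted(full_tape_numbers)]
    order += [("incremental", d) for d in sorted(incremental_dates)]
    return order


if __name__ == "__main__":
    tapes = load_order([2, 1, 3],
                       [date(1981, 3, 9), date(1981, 3, 8)])
    for kind, label in tapes:
        print(kind, label)
```
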
CHANGING RECOVERY SWITCH SETTINGS
1) If your prompt is not an "!", type "ena<CR>" at the EXEC "@".
2) At the "!" prompt, type "mddt<CR>".

3) Type "switchname/", where switchname stands for the name of
the switch you want to change.

4) Type "switchvalue<CR>", where switchvalue stands for the new
value the switch should have: 0, 1, or 2.

5) Type "<CTRL-C>" to return to EXEC.


CORRECTING PROBLEMS FOUND BY CHECKDISK 


1) If only one file has errors, delete and expunge that file.
Be sure to type the entire file name.

2) If more than one file is involved, delete and expunge all
files with IDAs. Do not delete the files with MDAs at this
point.

3) Halt the system with Method A of section 13.5 and bring it up
with disk recovery. If CHECKDISK again finds files with IDAs,
repeat this procedure.

4) Once no files have IDAs, if one or more files have MDAs,
delete and expunge the file with the largest number of MDAs,
halt the system, and bring it up with disk recovery. Do this
three times. If you then still have more than 20 files with
MDAs, call an operating systems programmer.

NOTE: Keep a list of the files you delete and expunge, and
restore them after the system comes up. Always send messages to
all users whose files have been deleted and restored.

RUNNING CHECKDISK YOURSELF FROM EXEC
1) At the EXEC "@", type "<system>checkdisk<ESC><CR>".

2) When CHECKDISK asks, "Do you want to run in multiple fork
mode?", type "Y", for yes. All answers to CHECKDISK's questions
must be capitalized and one letter.

3) To the question, "Do you want to run backwards?", type "N",
for no.

4) To the question, "Rebuild the bit table?", type "N".

5) To the question, "Scan for disk addresses?", type "N".

6) CHECKDISK will check the disk for bad files. For
instructions on how to deal with bad files, see section 13.2.
DELETING AND EXPUNGING FILES

1) If your prompt is not an "!", type "ena<CR>" at the EXEC "@".

2) Connect to the directory that contains the file by typing
"cd<SP>directoryname<CR>", where directoryname stands for the
name of the directory you need.

3) Type "del<SP>filename<CR>". Make sure you type the entire
file name including extensions. Do not use <ESC> to fill out
the name; the file may not be recognized correctly. Precede
all unusual characters in the file name, for example @, with
<CTRL-V>.

4) If the system tells you the file is perpetual, type
"not<SP>perp<SP>filename" and then delete the file.

5) Type "exp<ESC><CR>".

6) When you finish deleting files, type "cd<SP>oper<CR>".

NOTE: Always send a message to any user whose files you have
deleted.

HALTING THE SYSTEM 

Method A. Halting a Running System from the Operator's Terminal
This procedure has two parts. First, you need to get into
MINI-EXEC, and then you need to halt the system. If you are
already in MINI-EXEC when you decide to halt the system,
start this procedure on step 4; if you are not, start at step
1. The way to tell if you are in MINI-EXEC is to look at the
prompt. If it is a period (.), you are in MINI-EXEC; if it
is anything else, you are not.

1) If your prompt is not an "!", type "ena<CR>" at the
EXEC "@".

2) Type "quit<CR>".

3) When the system asks, "Do you really want to go into
AUGUST monitor? (Confirm)", type "<CR>".

4) At the period (.) prompt, type "h".

5) The system will echo, "HALT TENEX".

6) Type "e".

7) The system will halt.


Method B. Halting the System from the Control Panel

1) Put address switch 31 on (up).

2) Put data switch 2 on.

3) Put CONSOLE DEPOSIT THIS on.

4) Put data switch 0 on.

5) Put CONSOLE DEPOSIT THIS on.

6) When activity (the flickering of the lights, etc.) stops,
the system has halted.


CHANGING THE DATE AND TIME 


1) If your prompt is not an "!", type "ena<CR>" at the EXEC "@".

2) When you see the prompt "!", type
"<CTRL-E>set<SP>DD-MON-YY<SP>HH:MM<CR>"; that is, two numbers
for the day, a dash, the first three letters of the month, a
dash, and then two numbers for the year. Follow this with a
space and then give the time on a 24 hour basis. You must type
the entire date and time to reset any part of it.

3) Type a confirming <CR>.

4) At the EXEC "!", type "day<CR>" to check the new date and
time.


CONNECTING TO AND DISCONNECTING FROM MICRONODE TYMBASE 


Method A: Controlling interaction from EXEC


To turn off the micronode connection
1) At the EXEC "@", type "<CTRL-E>tymnet<SP>off<CR>".
To turn on the micronode connection
1) At the EXEC "@", type "<CTRL-E>tymnet<SP>on<CR>".

Method B: Controlling interaction from EDDT
To turn off the micronode connection
1) In EDDT, type "tymflg/".
2) After the system prints a value, type "0<CR>".
To turn on the micronode connection
1) In EDDT, type "tymflg/".
2) After the system prints a value, type "-1<CR>".


